🤖 AI Summary
Classical fault-tolerance theory models only process failures, failing to capture the impact of communication channel failures. Method: This paper introduces the first systematic characterization of implementability boundaries for atomic registers, atomic snapshots, lattice agreement, and consensus under arbitrary mixed process–channel failure models. We propose a generalized quorum framework replacing classical bidirectional connectivity with unidirectional reachability, and design a request-response–independent logical clock mechanism enabling asynchronous state synchronization. Contribution/Results: We establish necessary and sufficient conditions for implementing these primitives under channel failures and derive the first tight fault-tolerance threshold—namely, an exact relationship between the maximum tolerable number of faulty channels and network topology. Our results transcend the traditional process-only failure paradigm, providing a foundational theoretical basis for designing strongly consistent protocols in highly dynamic, low-reliability networks.
📝 Abstract
Communication channel failures are a major concern for the developers of modern fault-tolerant systems. However, while tight bounds for process failures are well-established, extending them to include channel failures has remained an open problem. We introduce emph{generalized quorum systems} - a framework that characterizes the necessary and sufficient conditions for implementing atomic registers, atomic snapshots, lattice agreement and consensus under arbitrary patterns of process-channel failures. Generalized quorum systems relax the connectivity constraints of classical quorum systems: instead of requiring bidirectional reachability for every pair of write and read quorums, they only require some write quorum to be emph{unidirectionally} reachable from some read quorum. This weak connectivity makes implementing registers particularly challenging, because it precludes the traditional request/response pattern of quorum access, making classical solutions like ABD inapplicable. To address this, we introduce novel logical clocks that allow write and read quorums to reliably track state updates without relying on bi-directional connectivity.