Abstract

Fault-tolerant protocols such as group multicast and membership are important abstractions that help simplify the development of distributed operating systems with fault-tolerance requirements. Unfortunately, these protocols are often very complicated in their own right, which makes their design and implementation a non-trivial task. If the power of these abstractions is to be effectively utilized in future systems, further efforts to improve our understanding of these protocols and their fundamental properties are needed. Our current research is addressing these issues by applying modularization techniques to fault-tolerant protocols. Our approach is based on identifying orthogonal properties of a given protocol, and then realizing these properties as separate modules within a standard system framework. For example, group multicast protocols are often defined to be some combination of reliability, atomicity, and consistent ordering; reliability is the property that a given message is always delivered, atomicity that either all group members or no group member receives the message, and consistent ordering that all group members see the same (causal or total) order of messages. Our goal is to develop a new model for fault-tolerant protocols that will facilitate such modularization, and hence, make it easier to understand both the properties themselves and their inherent dependencies. This research builds on previous work involving Psync [Pete89] and Consul [Mish91, Mish92].
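
The idea of realizing each property as a separate module can be illustrated with a small sketch. This is not the model proposed in the paper, nor the Psync or Consul interfaces; it is a hypothetical Python illustration in which every name (`Message`, `Module`, `DuplicateFilter`, `TotalOrder`, `Stack`) is an assumption made for the example. It also assumes that each message already carries a global sequence number assigned elsewhere, which a real total-ordering protocol would have to establish itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Message:
    sender: str
    seq: int       # global sequence number (assumed to be assigned elsewhere)
    payload: str


class Module:
    """One protocol property, realized as a stage that filters or reorders messages."""

    def receive(self, msg: Message) -> List[Message]:
        raise NotImplementedError


class DuplicateFilter(Module):
    """Drops retransmitted copies, as a reliability layer with retries would need."""

    def __init__(self) -> None:
        self._seen: Set[int] = set()

    def receive(self, msg: Message) -> List[Message]:
        if msg.seq in self._seen:
            return []
        self._seen.add(msg.seq)
        return [msg]


class TotalOrder(Module):
    """Holds a message back until every lower sequence number has been released."""

    def __init__(self) -> None:
        self._next = 0
        self._pending: Dict[int, Message] = {}

    def receive(self, msg: Message) -> List[Message]:
        self._pending[msg.seq] = msg
        ready: List[Message] = []
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready


class Stack:
    """Composes modules; each message passes through them in the given order."""

    def __init__(self, modules: List[Module]) -> None:
        self._modules = list(modules)
        self.delivered: List[Message] = []

    def receive(self, msg: Message) -> List[Message]:
        batch = [msg]
        for module in self._modules:
            next_batch: List[Message] = []
            for m in batch:
                next_batch.extend(module.receive(m))
            batch = next_batch
        self.delivered.extend(batch)
        return batch
```

A group member that wants both properties would build `Stack([DuplicateFilter(), TotalOrder()])`, while one that only needs duplicate suppression would omit `TotalOrder`. In this toy version the ordering stage relies on duplicates having been removed upstream, which is a small instance of the kind of dependency between properties that the abstract refers to.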

Keywords:
Computer science, Fault tolerance, Programming language, Distributed computing

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.53
Refs: 8
Citation Normalized Percentile: 0.68

Topics

Distributed systems and fault tolerance (Physical Sciences → Computer Science → Computer Networks and Communications)
Software System Performance and Reliability (Physical Sciences → Computer Science → Computer Networks and Communications)
Cloud Computing and Resource Management (Physical Sciences → Computer Science → Information Systems)
