Fault tolerance is one of the main problems that must be resolved to improve the adoption of the agent computing paradigm. In this paper, we develop a pragmatic framework for fault tolerance in agent systems. The framework combines an independent checkpointing strategy with a cooperating agent and passive replication to offer a low-cost, application-transparent model for reliable agent-based computing. It covers all faults that might invalidate reliable agent execution, migration, and communication, and it maintains the exactly-once and non-blocking properties. Finally, we present performance results that show the effectiveness of the proposed fault-tolerance scheme.
Jin Yang, Jiannong Cao, Weigang Wu, Chengzhong Xu