Hamid Reza Faragardi, Reza Shojaee, Maziar Mirzazad-Barijough, Roozbeh Nosrati
A real-time parallel application can be divided into a number of tasks that execute concurrently on distinct nodes of a Distributed System (DS). Distributed System Reliability (DSR) can be defined as the probability that all the tasks in the system run successfully. Because nodes and links have different hazard rates, DSR depends critically on how these tasks are allocated to the available nodes. In this paper, we present a mathematical model for analyzing DSR in a DS on which hard real-time periodic tasks are executed. We also propose an offline task allocation algorithm that maximizes reliability while satisfying the constraints. The algorithm is a new swarm intelligence approach based on Ant Colony Optimization (ACO). To evaluate it, we compare ACO with Honey Bee Mating Optimization (HBMO) and Particle Swarm Optimization (PSO). Simulation results show that ACO produces better solutions than PSO and HBMO and also requires less execution time. The results also demonstrate the flexibility and scalability of the proposed algorithm.
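To make the dependence of DSR on task allocation concrete, the following is a minimal sketch and not the paper's model. It assumes constant hazard rates and exponentially distributed failures, so that the probability of all tasks completing is the exponential of the negated sum of (hazard rate × busy time) over nodes and links. It ignores the real-time constraints and the ACO search itself. The function name `system_reliability` and all data values are hypothetical, chosen only for illustration.

```python
import math

def system_reliability(allocation, exec_time, node_hazard, comm_time, link_hazard):
    """Reliability of one allocation under an ASSUMED exponential failure model.

    allocation[i]      : node index that task i is placed on
    exec_time[i][n]    : execution time of task i on node n
    node_hazard[n]     : constant hazard rate of node n
    comm_time[(i, j)]  : communication time between tasks i and j
    link_hazard[(a,b)] : constant hazard rate of the link between nodes a < b
    """
    exponent = 0.0
    # Each node is exposed to failure for as long as it executes its tasks.
    for task, node in enumerate(allocation):
        exponent += node_hazard[node] * exec_time[task][node]
    # A link is exposed only when communicating tasks sit on different nodes.
    for (i, j), t in comm_time.items():
        a, b = allocation[i], allocation[j]
        if a != b:
            exponent += link_hazard[(min(a, b), max(a, b))] * t
    return math.exp(-exponent)

# Hypothetical data: three tasks, two nodes, one link.
exec_time = [[2.0, 3.0], [1.0, 1.5], [4.0, 2.0]]
node_hazard = [0.001, 0.002]
comm_time = {(0, 1): 0.5, (1, 2): 1.0}
link_hazard = {(0, 1): 0.01}

split = system_reliability([0, 0, 1], exec_time, node_hazard, comm_time, link_hazard)
colocated = system_reliability([0, 0, 0], exec_time, node_hazard, comm_time, link_hazard)
print(split, colocated)
```

With these illustrative numbers, placing all tasks on node 0 avoids the unreliable link and yields a higher value than splitting them, which is the kind of trade-off an allocation algorithm has to search over.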
Ghasem S. Alijani, Horst F. Wedde