Resource performance fluctuation and resource failure are two main factors that affect task execution in cloud systems, especially for deadline-constrained workflow instances with precedence relationships among tasks. Many cloud workflow scheduling algorithms have been designed to handle either fault tolerance or performance fluctuation, but few works consider the two issues simultaneously. In this paper, an algorithm named FCWSU is proposed for fault-tolerant scheduling of workflows whose task execution times are uncertain because of resource performance fluctuation in clouds. A novel workflow scheduling architecture is designed in FCWSU to mitigate the delay propagation caused by either performance fluctuation or failure of virtual machines (VMs). A scheduling algorithm based on the PB (Primary-Backup) model is proposed to tolerate cloud resource failures. Experimental results show that FCWSU provides better scheduling strategies for deadline-constrained workflows than its competitors.
Zhongjin Li, Jiacheng Yu, Haiyang Hu, Jie Chen, Hua Hu, Jidong Ge, Victor Chang
Jiagang Liu, Ju Ren, Wei Dai, Deyu Zhang, Pude Zhou, Yaoxue Zhang, Geyong Min, Noushin Najjari