JOURNAL ARTICLE

Optimized Speculative Execution Strategy for Different Workload Levels in Heterogeneous Spark Cluster

Abstract

Spark is a big data processing framework based on MapReduce, whose computation model requires that all tasks in all parent stages complete before a new stage starts. Variability in machine service times and congested network connections, caused by partial or intermittent machine failures, become a bottleneck for task execution in Spark. In this paper, we focus on the design of speculative execution schemes for heterogeneous Spark clusters from an optimization perspective under different load conditions. First, we derive the load arrival rate threshold that separates the operating regimes. Second, for the lightly loaded case, we analyze and propose a speculative execution algorithm based on task cloning (SETC), which reduces the application completion time by maximizing the overall system utility. Then, for the heavily loaded case, we propose a speculative execution algorithm based on straggler detection (SESD), which aims to mitigate stragglers. Finally, we conduct experiments to verify the performance of SETC and SESD. Results show that our method is faster than Spark-Speculation, LATE, and SCA by 16.7%, 8.2%, and 11.7%, respectively. It also outperforms the baseline algorithms on other metrics, such as cluster throughput.
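The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's SETC or SESD: the function names, the threshold value, and the median-multiple straggler rule are all assumptions made for illustration, and the paper derives its own arrival rate threshold and utility model. The sketch shows only why a stage barrier makes the slowest task decisive, how cloning shortens a stage, and how a regime switch on the arrival rate might look.

```python
import statistics


def choose_strategy(arrival_rate, threshold):
    """Hypothetical regime switch: clone when lightly loaded, else detect stragglers."""
    return "clone" if arrival_rate < threshold else "detect"


def stage_completion_time(task_times):
    """A stage ends only when its slowest task ends (barrier between stages)."""
    return max(task_times)


def stage_time_with_cloning(copy_times_per_task):
    """With cloning, each task finishes as soon as its fastest copy finishes."""
    return max(min(copies) for copies in copy_times_per_task)


def detect_stragglers(elapsed, multiplier=1.5):
    """Assumed rule: flag tasks running longer than multiplier x median elapsed time."""
    median = statistics.median(elapsed.values())
    return sorted(task for task, t in elapsed.items() if t > multiplier * median)


# Illustrative numbers only; none of these values come from the paper.
print(choose_strategy(arrival_rate=2.0, threshold=5.0))        # clone
print(stage_completion_time([3, 4, 20]))                       # 20
print(stage_time_with_cloning([[3, 5], [4, 6], [20, 5]]))      # 5
print(detect_stragglers({"t0": 10, "t1": 11, "t2": 12, "t3": 40}))  # ['t3']
```

In this toy example one slow task stretches the stage from about 4 to 20 time units, while a single extra copy of each task brings it down to 5. Cloning costs extra slots, which is why it is plausible only when the cluster is lightly loaded; under heavy load, launching copies only for detected stragglers spends fewer resources.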

Keywords:
SPARK (programming language); Workload; Computer science; Cluster (spacecraft); Speculative multithreading; Parallel computing; Operating system; Multithreading; Programming language

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 1.46
Refs: 18
Citation Normalized Percentile: 0.86

Topics

Cloud Computing and Resource Management
Physical Sciences →  Computer Science →  Information Systems
Distributed and Parallel Computing Systems
Physical Sciences →  Computer Science →  Computer Networks and Communications
IoT and Edge/Fog Computing
Physical Sciences →  Computer Science →  Computer Networks and Communications

Related Documents

JOURNAL ARTICLE

An Improved Speculative Strategy for Heterogeneous Spark Cluster

Pengfei Zhang, Zonghuai Guo

Journal: MATEC Web of Conferences Year: 2018 Vol: 173 Pages: 01018-01018

JOURNAL ARTICLE

Optimizing Speculative Execution in Spark Heterogeneous Environments

Zhongming Fu, Zhuo Tang

Journal: IEEE Transactions on Cloud Computing Year: 2019 Vol: 10 (1) Pages: 568-582

JOURNAL ARTICLE

Big Data Cluster Processing Through Optimized Speculative Execution

D. Sasi Redkha

Journal: INTERNATIONAL JOURNAL OF EMERGING TRENDS IN SCIENCE AND TECHNOLOGY Year: 2017 Vol: 4 (9)