Spark is a big data processing framework based on MapReduce, whose computation model requires that all tasks in all parent stages complete before a new stage can start. Service variability across machines or congested network connections caused by partial or intermittent machine failures can therefore become a bottleneck when the Spark framework executes tasks. In this paper, we focus on designing speculative execution schemes for heterogeneous Spark from an optimization perspective under different loading conditions. First, we derive the load arrival rate threshold that separates the different operating regimes. Second, for the lightly loaded case, we analyze and propose the speculative execution based on task-cloning algorithm (SETC), which reduces application completion time by maximizing the overall system utility. Then, for the heavily loaded case, we propose the speculative execution based on straggler-detection algorithm (SESD), which aims to mitigate stragglers. Finally, we conduct experiments to verify the performance of SETC and SESD. Results show that our method is faster than Spark-Speculation, LATE, and SCA by 16.7%, 8.2%, and 11.7%, respectively. It also outperforms the baseline algorithms on other metrics such as cluster throughput.
Liu Qi, Weidong Cai, Zhangjie Fu, Jian Shen, Nigel Linge
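The abstract does not specify how SESD identifies stragglers, but a common approach in speculative execution (used, e.g., by LATE) is to compare each task's progress rate against its peers and launch speculative copies of the slow ones. The sketch below is a hypothetical illustration of that general idea; the function name, the progress-rate representation, and the `slow_factor` threshold are all assumptions, not the paper's actual algorithm.

```python
# Illustrative straggler detection by progress rate (assumed scheme,
# not the paper's SESD): a task is flagged as a straggler candidate
# when its progress rate falls below a fraction of the mean rate.

def detect_stragglers(progress_rates, slow_factor=0.5):
    """Return indices of tasks whose progress rate is below
    slow_factor * (mean progress rate of all running tasks).
    Flagged tasks are candidates for speculative (backup) copies."""
    if not progress_rates:
        return []
    mean_rate = sum(progress_rates) / len(progress_rates)
    return [i for i, rate in enumerate(progress_rates)
            if rate < slow_factor * mean_rate]

# Example: task 2 progresses much more slowly than its peers,
# so it would be scheduled for a speculative copy.
rates = [1.0, 0.9, 0.2, 1.1]
print(detect_stragglers(rates))  # -> [2]
```

A real scheduler would additionally weigh the cost of the backup copy against the expected speedup, which is where the heavy-load versus light-load distinction drawn in the abstract matters.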