JOURNAL ARTICLE

Interference and locality-aware task scheduling for MapReduce applications in virtual clusters

Abstract

MapReduce has emerged as an important distributed programming paradigm for large-scale applications, and running MapReduce applications in clouds presents an attractive usage model for enterprises. In a virtual MapReduce cluster, however, interference between virtual machines (VMs) degrades the performance of map and reduce tasks and renders existing data locality-aware task scheduling policies, such as delay scheduling, ineffective. On the other hand, virtualization offers an extra opportunity for data locality among co-hosted VMs. In this paper, we present a task scheduling strategy that mitigates interference while preserving task data locality for MapReduce applications. The strategy includes an interference-aware scheduling policy, based on a task performance prediction model, and an adaptive delay scheduling algorithm for data locality improvement. We implement the interference and locality-aware (ILA) scheduling strategy in a virtual MapReduce framework and evaluate its effectiveness and efficiency on a 72-node Xen-based virtual cluster. Experimental results with 10 representative CPU- and IO-intensive applications show that ILA achieves a speedup of 1.5 to 6.5 times for individual jobs and improves system throughput by up to 1.9 times in comparison with four other MapReduce schedulers.
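For readers unfamiliar with the delay scheduling baseline named in the abstract, the following is a minimal sketch of the classic, non-adaptive form of that policy. It is not the ILA algorithm: the abstract does not give ILA's prediction model or its adaptive delay rule, so none of that is reproduced here. The names `Job`, `assign_task`, and `max_skips`, and the use of a fixed skip threshold, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    # Hypothetical layout: pending task id -> set of nodes holding its input block.
    pending: dict = field(default_factory=dict)
    skipped: int = 0  # scheduling opportunities declined while waiting for locality


def assign_task(jobs, free_node, max_skips=3):
    """Pick a task for free_node using a fixed-threshold delay scheduling rule.

    Returns (job_name, task_id, is_local), or None if nothing is launched.
    """
    for job in jobs:  # jobs are assumed to already be in queue (e.g. fair-share) order
        if not job.pending:
            continue
        # Prefer a task whose input data lives on the free node.
        for task_id, nodes in job.pending.items():
            if free_node in nodes:
                del job.pending[task_id]
                job.skipped = 0
                return job.name, task_id, True
        # No local task: launch remotely only once the job has waited long enough.
        if job.skipped >= max_skips:
            task_id = next(iter(job.pending))
            del job.pending[task_id]
            return job.name, task_id, False
        # Otherwise skip this job for now and let a later job use the slot.
        job.skipped += 1
    return None
```

With a fixed `max_skips`, a job at the head of the queue gives up a bounded number of slots in the hope that a node holding its data frees up. An adaptive variant, such as the one the abstract describes, would presumably adjust that threshold at run time instead of keeping it constant.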

Keywords:
Computer science; Locality; Scheduling (production processes); Distributed computing; Virtual machine; Virtualization; Parallel computing; GPU cluster; Speedup; Cloud computing; Operating system; CUDA

Metrics

Cited By: 99
FWCI (Field Weighted Citation Impact): 40.04
Refs: 36
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Cloud Computing and Resource Management
Physical Sciences →  Computer Science →  Information Systems
IoT and Edge/Fog Computing
Physical Sciences →  Computer Science →  Computer Networks and Communications
Distributed and Parallel Computing Systems
Physical Sciences →  Computer Science →  Computer Networks and Communications