Network-aware scheduling of MapReduce framework on distributed clusters over high speed networks

Praveenkumar Kondikoppa; Chui‐Hui Chiu; Cheng Cui; Lin Xue; Seung‐Jong Park

doi:10.1145/2378975.2378985

JOURNAL ARTICLE

Year: 2012 Pages: 39-44

Abstract

Google's MapReduce has gained significant popularity as a platform for large-scale distributed data processing. Hadoop [1] is an open-source implementation of the MapReduce [11] framework. It was originally developed to operate in a single-cluster environment and could not be leveraged for distributed data processing across federated clusters. When multiple federated clusters are connected by high-speed networks, computing resources can be provisioned from any cluster in the federation. Placing each map task close to its data split is critical for Hadoop's performance. In this work, we add network awareness to Hadoop's scheduling of map tasks over federated clusters. We observe a 12% to 15% reduction in execution time with Hadoop's FIFO and FAIR schedulers for varying workloads.
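
The idea of placing a map task close to its data split can be illustrated with a small sketch. This is not the authors' implementation and does not use Hadoop's scheduler API. The `Node` and `MapTask` types, the `transfer_cost` and `pick_map_task` functions, the unit cost for same-cluster transfers, and the `inter_cluster_cost` table are all hypothetical names and values chosen for illustration. When a worker asks for work, the sketch picks the pending task whose nearest input replica is cheapest to reach over the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cluster: str


@dataclass
class MapTask:
    task_id: str
    replicas: list  # Nodes that hold a copy of this task's input split


def transfer_cost(worker, replica, inter_cluster_cost):
    """Assumed cost of reading a split stored on `replica` from `worker`."""
    if worker.name == replica.name:
        return 0.0  # node-local read
    if worker.cluster == replica.cluster:
        return 1.0  # same cluster (assumed unit cost)
    # Different clusters: look up the assumed cost of the link between them.
    return inter_cluster_cost[frozenset((worker.cluster, replica.cluster))]


def pick_map_task(worker, pending, inter_cluster_cost):
    """Return the pending task that is cheapest for `worker`, or None."""
    def best_cost(task):
        # A task costs as much as its closest replica; replicas must be non-empty.
        return min(transfer_cost(worker, r, inter_cluster_cost)
                   for r in task.replicas)
    return min(pending, key=best_cost, default=None)


if __name__ == "__main__":
    a1, a2 = Node("a1", "A"), Node("a2", "A")
    b1, c1 = Node("b1", "B"), Node("c1", "C")
    # Hypothetical link costs between clusters (for example, measured latency).
    cost = {frozenset(("A", "B")): 5.0,
            frozenset(("A", "C")): 20.0,
            frozenset(("B", "C")): 8.0}
    tasks = [MapTask("t1", [c1]), MapTask("t2", [b1]), MapTask("t3", [a2, c1])]
    print(pick_map_task(a1, tasks, cost).task_id)
```

A real scheduler would also have to weigh fairness, queue order, and changing link conditions, which this sketch leaves out.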

Keywords: Computer science; Scheduling (production processes); Provisioning; Distributed computing; Distributed database; Big data; Cluster (spacecraft); Parallel computing; Operating system; Computer network

Metrics

FWCI (Field Weighted Citation Impact): 8.37

Citation Normalized Percentile: 0.97

Topics

Cloud Computing and Resource Management

Physical Sciences → Computer Science → Information Systems

Data Stream Mining Techniques

Physical Sciences → Computer Science → Artificial Intelligence

IoT and Edge/Fog Computing

Physical Sciences → Computer Science → Computer Networks and Communications

Related Documents

Resource-Aware Adaptive Scheduling for MapReduce Clusters

Availability and Network-Aware MapReduce Task Scheduling over the Internet

Thermal-Aware Job Scheduling of MapReduce Applications on High Performance Clusters

Phurti: Application and Network-Aware Flow Scheduling for Multi-tenant MapReduce Clusters

Job Aware Scheduling Algorithm for MapReduce Framework