Abstract

Organizations of all sizes are shifting their IT infrastructures to the cloud because of its cost efficiency and convenience. Because Infrastructure as a Service (IaaS) clouds operate on demand, hundreds of thousands of virtual machines (VMs) may be deployed and terminated in a single large cloud data center each day. In this paper, we propose a content-based scheduling algorithm for the placement of VMs in data centers. VM disk images with similar operating systems often contain identical disk blocks; by scheduling VMs with high content similarity on the same hosts, we reduce the amount of data transferred when deploying a VM on a destination host. We first present a study of content similarity between VMs, based on a large set of VMs whose operating systems represent the majority of popular operating systems in use today. Our analysis shows that content similarity between VMs with the same operating system and close version numbers (e.g., Ubuntu 12.04 vs. Ubuntu 11.10) can be as high as 60%, while content similarity between VMs with different operating systems is close to zero. Second, based on these results, we design a content-based scheduling algorithm that lowers the network traffic associated with the transfer of VM disk images inside data centers. Our experimental results show that the amount of data transferred when deploying VMs and moving virtual disk images can be lowered by more than 70%, significantly reducing data center network utilization and congestion.
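The placement idea can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not spell out; it is one plausible greedy reading under stated assumptions: disk images are split into fixed-size blocks (the 4096-byte size is an assumption), blocks are compared by SHA-256 digest, and each host is represented by the set of block digests it already stores. The names `block_hashes`, `similarity`, and `pick_host` are hypothetical, and host capacity constraints are ignored.

```python
import hashlib
from typing import Dict, Set, Tuple

BLOCK_SIZE = 4096  # assumed block size; the abstract does not state one


def block_hashes(image: bytes, block_size: int = BLOCK_SIZE) -> Set[str]:
    """Return the set of SHA-256 digests of the fixed-size blocks of an image."""
    return {
        hashlib.sha256(image[i:i + block_size]).hexdigest()
        for i in range(0, len(image), block_size)
    }


def similarity(vm_blocks: Set[str], host_blocks: Set[str]) -> float:
    """Fraction of the VM's distinct blocks that the host already holds."""
    if not vm_blocks:
        return 0.0
    return len(vm_blocks & host_blocks) / len(vm_blocks)


def pick_host(vm_blocks: Set[str], hosts: Dict[str, Set[str]]) -> Tuple[str, int]:
    """Greedy choice (an assumption): pick the host sharing the most blocks.

    Returns the host name and the number of distinct blocks that would
    still have to be transferred to it.
    """
    best = max(hosts, key=lambda name: len(vm_blocks & hosts[name]))
    return best, len(vm_blocks - hosts[best])


# Toy images: two "close versions" share two of three blocks; the third
# image shares nothing with them.
image_a = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
image_b = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"D" * BLOCK_SIZE
image_c = b"X" * BLOCK_SIZE + b"Y" * BLOCK_SIZE

hosts = {"host1": block_hashes(image_c), "host2": block_hashes(image_a)}
chosen, to_transfer = pick_host(block_hashes(image_b), hosts)
print(chosen, to_transfer)
```

In this toy setup the new VM lands on the host that already stores a similar image, so only the blocks that differ need to cross the network. A real scheduler would also have to weigh CPU, memory, and storage capacity, which this sketch leaves out.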

Keywords:
Computer science; Cloud computing; Scheduling (production processes); Virtual machine; Operating system; Data center; Distributed computing; Software deployment; Computer network; Engineering

Metrics

Cited By: 35
FWCI (Field Weighted Citation Impact): 4.71
Refs: 12
Citation Normalized Percentile: 0.95

Topics

Caching and Content Delivery
Physical Sciences → Computer Science → Computer Networks and Communications
Cloud Computing and Resource Management
Physical Sciences → Computer Science → Information Systems
Advanced Data Storage Technologies
Physical Sciences → Computer Science → Computer Networks and Communications
