JOURNAL ARTICLE

Pipelined Online Data Insertion for Erasure Codes in Distributed Storage Systems

Abstract

Distributed storage systems commonly adopt erasure codes to ensure data reliability because of their better space efficiency and higher reliability. However, existing data insertion schemes for erasure codes are ill-suited to continuous online data: centralized insertion schemes create a bottleneck, and decentralized insertion schemes are inefficient. In this paper, we propose POIS, a pipelined online data insertion scheme for distributed storage systems with erasure codes. For efficiency, we propose a distance-aware node selection technique that improves transmission efficiency by selecting the nodes with higher available bandwidth. Moreover, we propose a distributed data processing technique that maximizes encoding efficiency by pipelining the data transmission and distributing the encoding operations. For adaptivity, we propose a rollback-based failure processing technique to handle node failures during the insertion process. To evaluate the performance of POIS, we conduct experiments on HDFS-RAID under various parameter settings on 200 virtual machines. Extensive experiments confirm that POIS improves insertion throughput, adaptively adjusts the insertion process by handling node failures during insertion, and significantly outperforms state-of-the-art approaches under various parameter settings.
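The two efficiency ideas named in the abstract, bandwidth-based node selection and encoding spread along a transmission pipeline, can be illustrated with a toy sketch. This is not the POIS algorithm, whose details the abstract does not give. It assumes a single XOR parity block in place of a general erasure code, an in-memory dictionary in place of real storage nodes, and hypothetical names (`select_nodes`, `pipelined_insert`, `xor_bytes`). Rollback on node failure is not modeled.

```python
def select_nodes(bandwidth, count):
    """Return the `count` node names with the highest available bandwidth.

    `bandwidth` maps node name -> available bandwidth (hypothetical units).
    """
    ranked = sorted(bandwidth, key=bandwidth.get, reverse=True)
    return ranked[:count]


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def pipelined_insert(blocks, nodes):
    """Place one data block per node along a chain and store parity at the end.

    Each hop keeps its own block and folds it into a running partial parity,
    so no single node performs the whole encoding. Assumes equal-length
    blocks and a single XOR parity (a stand-in for a real erasure code).
    """
    *data_nodes, parity_node = nodes
    if len(blocks) != len(data_nodes):
        raise ValueError("need exactly one data node per block plus one parity node")

    stored = {}
    partial = bytes(len(blocks[0]))  # all-zero starting parity
    for node, block in zip(data_nodes, blocks):
        stored[node] = block                  # this hop stores its data block
        partial = xor_bytes(partial, block)   # and updates the partial parity
    stored[parity_node] = partial             # last hop stores the final parity
    return stored


bandwidth = {"n1": 40, "n2": 95, "n3": 10, "n4": 70}
blocks = [b"\x01\x02", b"\x04\x08"]
chain = select_nodes(bandwidth, len(blocks) + 1)
stored = pipelined_insert(blocks, chain)
```

With the parity in place, a lost data block can be rebuilt by XOR-ing the parity with the surviving block, which is the property that lets the partial parity be accumulated hop by hop.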

Keywords:
Computer science; Erasure code; Bottleneck; Erasure; Distributed data store; Node (physics); Throughput; Distributed computing; Reliability (semiconductor); Data transmission; Computer network; Decoding methods; Embedded system; Algorithm; Operating system; Power (physics)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 24
Citation Normalized Percentile: 0.19

Topics

Advanced Data Storage Technologies
Physical Sciences →  Computer Science →  Computer Networks and Communications
Caching and Content Delivery
Physical Sciences →  Computer Science →  Computer Networks and Communications
Peer-to-Peer Network Technologies
Physical Sciences →  Computer Science →  Computer Networks and Communications