JOURNAL ARTICLE

Data Driven Priority Scheduling on Spark Based Stream Processing

Abstract

This paper focuses on priority-based processing of streaming data. One of the greatest challenges in big data analytics is responding to a bursty input load. A common solution is dynamic resource provisioning; however, such techniques may not respond quickly enough to a change in load. Another option is to overprovision, but this wastes computing resources. This paper describes a technique for cases where resources are statically provisioned. It lets users prioritize certain input data items so that, when the load suddenly increases, high-priority items are given precedence over low-priority items. The technique is implemented on the Spark Streaming engine.
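
The idea of giving high-priority items precedence under a fixed capacity can be illustrated with a minimal sketch. This is not the paper's implementation and it does not use any Spark Streaming API; the class `PriorityBatcher`, its methods, the per-batch capacity, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
import heapq
from itertools import count


class PriorityBatcher:
    """Hypothetical helper: buffers items and releases the highest-priority ones first."""

    def __init__(self, capacity_per_batch):
        # Fixed capacity stands in for statically provisioned resources (assumption).
        self.capacity = capacity_per_batch
        self._heap = []
        # Sequence numbers break ties so equal-priority items keep arrival order.
        self._seq = count()

    def submit(self, item, priority):
        # Assumption for this sketch: lower number = higher priority.
        heapq.heappush(self._heap, (priority, next(self._seq), item))

    def next_batch(self):
        # Release at most `capacity` items; anything left waits for a later batch.
        n = min(self.capacity, len(self._heap))
        return [heapq.heappop(self._heap)[2] for _ in range(n)]

    def pending(self):
        return len(self._heap)


# A burst of four items arrives, but each batch can only process two.
batcher = PriorityBatcher(capacity_per_batch=2)
for item, prio in [("low-1", 5), ("high-1", 0), ("low-2", 5), ("high-2", 0)]:
    batcher.submit(item, prio)

first = batcher.next_batch()   # high-priority items are released first
second = batcher.next_batch()  # deferred low-priority items follow
```

In this sketch the low-priority items are delayed rather than dropped; whether the technique described in the paper defers or discards low-priority data under overload is not stated in the abstract.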

Keywords:
Provisioning, Computer science, Stream processing, SPARK (programming language), Scheduling (production processes), Big data, Distributed computing, Analytics, Real-time computing, Resource (disambiguation), Computer network, Database, Operating system, Engineering

Metrics

Cited By: 8
FWCI (Field Weighted Citation Impact): 1.34
Refs: 7
Citation Normalized Percentile: 0.86

Topics

Cloud Computing and Resource Management
Physical Sciences → Computer Science → Information Systems
Data Stream Mining Techniques
Physical Sciences → Computer Science → Artificial Intelligence
Advanced Database Systems and Queries
Physical Sciences → Computer Science → Computer Networks and Communications