JOURNAL ARTICLE

HGrid: A Data Model for Large Geospatial Data Sets in HBase

Abstract

Cloud-based infrastructures enable applications to collect and analyze massive amounts of data. Whether these applications are newly developed or evolved from existing RDBMS-based implementations, NoSQL databases offer an attractive platform with which to address this challenge. However, developers find it difficult to manage data effectively in NoSQL databases, because these platforms offer little support for data organization. Since poor data organization may misuse the features of the NoSQL database and result in unsatisfactory performance, developing a systematic method for NoSQL data-schema design is a timely and important problem. In this paper, we focus on geospatial applications, a family of big-data systems with distinct data types and usage patterns that require scalability. We propose the HGrid data model for HBase, based on a hybrid index structure that combines a quad-tree and a regular grid as primary and secondary indices, respectively. We have comparatively evaluated the performance of HGrid with uniform and skewed data against two other data models based on quad-tree and regular-grid indices. Our results demonstrate that HGrid scales well and supports efficient range and k-nearest-neighbor queries. Although this model does not outperform all its competitors in terms of query response time, it is more flexible for discontinuous and skewed spaces, and its index requires less space than the corresponding quad-tree and regular-grid indices, which makes its deployment possible with fewer resources. Through this study, we also formulate a set of guidelines on how to organize data for geospatial applications in HBase.
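To make the hybrid index idea concrete, the following is a minimal sketch of how a point could be addressed by a coarse quad-tree tile (primary index) and a fine regular-grid cell inside that tile (secondary index). The abstract does not specify the row-key or column layout, so everything here is an illustrative assumption: the function names `quadtree_key`, `grid_cell`, and `hgrid_address`, the unit-square coordinate domain [0, 1), the `depth` and `grid_size` parameters, and the choice to place the grid row in the row key and the grid column in the column qualifier. No HBase client calls are used.

```python
from typing import Tuple


def quadtree_key(x: float, y: float, depth: int) -> str:
    """Return the quad-tree tile path (one digit 0-3 per level) for a point in [0, 1) x [0, 1)."""
    x0, y0, x1, y1 = 0.0, 0.0, 1.0, 1.0
    digits = []
    for _ in range(depth):
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        quadrant = 0
        if x >= xm:          # east half sets bit 0
            quadrant |= 1
            x0 = xm
        else:
            x1 = xm
        if y >= ym:          # north half sets bit 1
            quadrant |= 2
            y0 = ym
        else:
            y1 = ym
        digits.append(str(quadrant))
    return "".join(digits)


def grid_cell(x: float, y: float, depth: int, grid_size: int) -> Tuple[int, int]:
    """Return (column, row) of the point in the regular grid laid over its quad-tree tile."""
    tile = 1.0 / (2 ** depth)            # side length of one quad-tree tile
    fx = (x % tile) / tile               # fractional position inside the tile, in [0, 1)
    fy = (y % tile) / tile
    col = min(int(fx * grid_size), grid_size - 1)
    row = min(int(fy * grid_size), grid_size - 1)
    return col, row


def hgrid_address(x: float, y: float, depth: int = 2, grid_size: int = 4) -> Tuple[str, str]:
    """Hypothetical layout (an assumption, not the paper's stated schema):
    row key = quad-tree path + grid row, column qualifier = grid column."""
    tile_path = quadtree_key(x, y, depth)
    col, row = grid_cell(x, y, depth, grid_size)
    return f"{tile_path}-{row:02d}", f"{col:02d}"


if __name__ == "__main__":
    # The point (0.3125, 0.8125) falls in tile "23", grid cell (col 1, row 1).
    print(hgrid_address(0.3125, 0.8125))  # ('23-01', '01')
```

Under this assumed layout, points in the same tile and grid row share a row-key prefix, so a range query could be served by scanning a small set of contiguous keys. Sparse or empty regions need no fine-grained cells, which is one plausible reading of why a hybrid index can take less space than a deep quad-tree or a uniformly fine grid.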

Keywords:
NoSQL, Computer science, Scalability, Geospatial analysis, Cloud computing, Data mining, Grid, Database, Big data, Tree (set theory)

Metrics

Cited By: 45
FWCI (Field Weighted Citation Impact): 2.66
Refs: 14
Citation Normalized Percentile: 0.91

Topics

Data Management and Algorithms
Physical Sciences → Computer Science → Signal Processing
Graph Theory and Algorithms
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Advanced Database Systems and Queries
Physical Sciences → Computer Science → Computer Networks and Communications

Related Documents

JOURNAL ARTICLE

Keynote: Algorithms for Large Geospatial Data Sets

Funke, Stefan

Journal: Thüringer Universitäts- und Landesbibliothek Year: 2025

JOURNAL ARTICLE

Visual Data Mining in Large Geospatial Point Sets

Daniel A. Keim, Christian Panse, Mike Sips, Stephen C. North

Journal: IEEE Computer Graphics and Applications Year: 2004 Vol: 24 (5) Pages: 36-44

JOURNAL ARTICLE

Hadoop-HBase for large-scale data

Mehul Nalin Vora

Year: 2011 Pages: 601-605

BOOK-CHAPTER

Fast SNN-Based Clustering Approach for Large Geospatial Data Sets

Arménio Antunes, Maribel Yasmina Santos, Adriano Moreira

Lecture notes in geoinformation and cartography Year: 2014 Pages: 179-195