JOURNAL ARTICLE

Mapping Hierarchical File Structures to Semantic Data Models for Efficient Data Integration into Research Data Management Systems

Henrik tom WördenFlorian SpreckelsenStefan LutherUlrich ParlitzAlexander Schlemmer

Year: 2024 Journal:   Data Vol: 9 (2)Pages: 24-24   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Although other methods exist to store and manage data in modern information technology, the standard solution is file systems. Therefore, keeping well-organized file structures and file system layouts can be key to a sustainable research data management infrastructure. However, file structures alone lack several important capabilities for FAIR data management: the two most significant being insufficient visualization of data and inadequate possibilities for searching and obtaining an overview. Research data management systems (RDMSs) can fill this gap, but many do not support the simultaneous use of the file system and RDMS. This simultaneous use can have many benefits, but keeping data in RDMS in synchrony with the file structure is challenging. Here, we present concepts that allow for keeping file structures and semantic data models (in RDMS) synchronous. Furthermore, we propose a specification in yaml format that allows for a structured and extensible declaration and implementation of a mapping between the file system and data models used in semantic research data management. Implementing these concepts will facilitate the re-use of specifications for multiple use cases. Furthermore, the specification can serve as a machine-readable and, at the same time, human-readable documentation of specific file system structures. We demonstrate our work using the Open Source RDMS LinkAhead (previously named “CaosDB”).

Keywords:
Computer science Data mapping Data management Data integration Semantic integration Information retrieval IDEF1X Data science Data mining Database Semantic Web Semantic computing Ontology-based data integration

Metrics

2
Cited By
4.68
FWCI (Field Weighted Citation Impact)
23
Refs
0.90
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Scientific Computing and Data Management
Social Sciences →  Decision Sciences →  Information Systems and Management
Research Data Management Practices
Physical Sciences →  Computer Science →  Information Systems
Data Quality and Management
Social Sciences →  Decision Sciences →  Management Science and Operations Research
© 2026 ScienceGate Book Chapters — All rights reserved.