DISSERTATION

Semantic and Efficient Symbolic Learning over Knowledge Graphs

Abstract

Knowledge Graphs (KGs) are structured representations designed to unify heterogeneous data and domain-specific semantics into a coherent and machine-interpretable format. By capturing entities, their attributes, and interrelations through labeled triples, KGs serve as powerful tools for integrating factual content with domain knowledge across diverse sources. In real-world applications, KGs are typically constructed under the Open World Assumption (OWA), a foundational semantic principle stating that the absence of a fact should not be interpreted as evidence of its falsehood, but rather as an indication of incomplete knowledge. The adoption of the OWA enables flexibility and scalability in KG design but also introduces inherent incompleteness.

KGs are frequently built through automated extraction pipelines that draw from unstructured text, semi-structured records, or disparate databases. As a result, many true facts may remain unstated, either due to source limitations or incomplete mappings. This incompleteness presents significant obstacles for downstream tasks such as learning, inference, and reasoning, which typically rely on observed data patterns. Accordingly, managing and mitigating incompleteness in KGs is a central challenge in ensuring accurate knowledge representation and effective reasoning in intelligent systems.

Knowledge Graph Completion (KGC) addresses the problem of inferring missing facts in a KG by identifying latent patterns and exploiting the underlying semantic structure of the data. To tackle KG incompleteness, inductive learning methods, both symbolic and numerical, are typically employed to generalize from observed data and predict plausible yet unrecorded triples. However, findings from this thesis reveal that these approaches struggle when applied to KGs that suffer from structural anomalies, semantic inconsistencies, or interoperability issues.
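The triple-based representation and the OWA described above can be illustrated with a minimal sketch (entity and relation names are invented for illustration and do not come from the dissertation's datasets):

```python
# A tiny KG as a set of (subject, predicate, object) triples.
# All names here are illustrative, not taken from the thesis.
kg = {
    ("Ada", "bornIn", "London"),
    ("Ada", "fieldOf", "Mathematics"),
    ("Grace", "fieldOf", "ComputerScience"),
}

def status(triple, kg):
    """Under the Open World Assumption, an absent triple is
    'unknown' rather than 'false'."""
    return "known true" if triple in kg else "unknown"

print(status(("Ada", "bornIn", "London"), kg))    # -> known true
print(status(("Grace", "bornIn", "NewYork"), kg)) # -> unknown, NOT false
```

Under a Closed World Assumption the second query would instead be answered "false"; the gap between "unknown" and "false" is exactly the space in which KGC methods operate.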
Empirical results demonstrate that numerical models often overfit to spurious correlations introduced by noisy data, while symbolic learners may produce rules that contradict domain knowledge when not guided by semantic constraints. These observations underscore a fundamental limitation in current approaches: their inability to validate and semantically align predictions. To address these shortcomings, this thesis establishes the necessity of a unified framework that integrates domain semantics, symbolic reasoning, and formal validation mechanisms, thereby ensuring that inferred knowledge is not only statistically plausible but also semantically sound and consistent with the KG's intended meaning.

The thesis proposes a knowledge-driven framework that enhances KGC through the integration of ontological reasoning, structural normalization, constraint validation, and neuro-symbolic learning. At the core of this framework is the use of ontologies as formal semantic backbones, encoding domain-specific hierarchies, constraints, and logical relationships. These ontological structures are leveraged in symbolic learning via entailment regimes and heuristics such as the Partial Completeness Assumption (PCA), enabling the inference of implicit, meaningful relationships. To further enhance structural integrity, the framework introduces a novel normalization theory for KGs, combined with SHACL-based validation, to resolve anomalies such as blank nodes, overloaded properties, and conflicting assignments. Experimental findings confirm that these normalization and validation steps significantly improve both the quality and interpretability of completed KGs. Building on this foundation, the thesis presents a neuro-symbolic learning architecture that combines the generalization power of neural embeddings with the rule-based transparency of symbolic reasoning.
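The PCA heuristic mentioned above can be sketched as it is commonly defined in rule mining over KGs: for a rule body(x, y) => head(x, y), PCA confidence divides the rule's support only by body instantiations whose subject already has at least one known head-relation fact, treating subjects with no such fact as unknown rather than as counterexamples. The facts and relation names below are hypothetical:

```python
# Illustrative facts; entity and relation names are invented.
facts = {
    ("marie", "worksAt", "sorbonne"),
    ("marie", "livesIn", "paris"),
    ("pierre", "worksAt", "sorbonne"),
    ("pierre", "livesIn", "paris"),
    ("paul", "worksAt", "sorbonne"),   # no livesIn fact for paul: unknown, not false
    ("sorbonne", "locatedIn", "paris"),
}

def pca_confidence(body_pairs, head_rel, facts):
    """PCA confidence of body(x, y) => head_rel(x, y).

    The denominator counts only predictions whose subject x has at
    least one known head_rel fact (Partial Completeness Assumption);
    standard confidence would count every body instantiation.
    """
    known_subjects = {s for (s, p, o) in facts if p == head_rel}
    support = sum(1 for (x, y) in body_pairs if (x, head_rel, y) in facts)
    pca_body = sum(1 for (x, y) in body_pairs if x in known_subjects)
    return support / pca_body if pca_body else 0.0

# Rule: worksAt(x, z) & locatedIn(z, y) => livesIn(x, y)
body_pairs = [
    (x, y)
    for (x, p1, z) in facts if p1 == "worksAt"
    for (z2, p2, y) in facts if p2 == "locatedIn" and z2 == z
]
print(pca_confidence(body_pairs, "livesIn", facts))  # 1.0: paul is ignored under PCA
```

Standard confidence for the same rule would be 2/3, since it counts paul's missing livesIn fact as a counterexample; the PCA avoids this penalty under the OWA.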
This hybrid model improves predictive performance, enforces semantic alignment, and supports explainability, outperforming traditional approaches across several benchmarks. Collectively, these contributions provide concrete responses to six research questions and establish neuro-symbolic KGC as an effective, scalable, and trustworthy paradigm for the development of semantically grounded AI systems.
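The semantic-alignment step that the framework enforces can be illustrated, in highly simplified form, as filtering candidate predictions against domain and range constraints before accepting them; this is a stand-in for the SHACL-based validation described above, and the types and constraints below are invented:

```python
# Hypothetical type assertions and domain/range constraints, standing in
# (very loosely) for SHACL shapes. All names are illustrative.
types = {"ada": "Person", "london": "City", "acme": "Company"}
constraints = {  # relation -> (required subject type, required object type)
    "bornIn": ("Person", "City"),
    "employs": ("Company", "Person"),
}

def validate(triple, types, constraints):
    """Accept a predicted triple only if it satisfies the relation's
    domain and range constraints; unconstrained relations pass."""
    s, p, o = triple
    dom, rng = constraints.get(p, (None, None))
    return (dom is None or types.get(s) == dom) and \
           (rng is None or types.get(o) == rng)

candidates = [
    ("ada", "bornIn", "london"),   # satisfies domain and range
    ("london", "employs", "ada"),  # violates domain: a City cannot employ
]
accepted = [t for t in candidates if validate(t, types, constraints)]
print(accepted)  # only the semantically consistent prediction survives
```

In the neuro-symbolic setting, a filter of this kind sits between the embedding model's ranked candidates and the completed KG, which is one way the hybrid architecture can keep statistically plausible but semantically invalid triples out of the result.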

Keywords:
Semantics (computer science); Knowledge graph; Knowledge representation and reasoning; Scalability; Semantic Web; Flexibility (engineering); Domain knowledge; Graph

Metrics

Cited By: 0
FWCI (Field-Weighted Citation Impact): 0.00
Refs: 0

Topics

Advanced Graph Neural Networks (Physical Sciences → Computer Science → Artificial Intelligence)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Graph Theory and Algorithms (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)

Related Documents

BOOK-CHAPTER

Semantic and Efficient Symbolic Learning over Knowledge Graphs

Disha Purohit

Lecture Notes in Computer Science, Year: 2023, Pages: 244-254
JOURNAL ARTICLE

Efficient Symbolic Learning over Knowledge Graphs

Sohan Deshar

Journal: Institutional Repository of Leibniz Universität Hannover (Leibniz Universität Hannover), Year: 2024
BOOK-CHAPTER

Neuro-Symbolic AI for Conflict-Aware Learning over Knowledge Graphs

Laura Balbi

Lecture Notes in Computer Science, Year: 2025, Pages: 276-287
JOURNAL ARTICLE

Efficient semantic summary graphs for querying large knowledge graphs

Emetis Niazmand, Gëzim Sejdiu, Damien Graux, María-Esther Vidal

Journal: International Journal of Information Management Data Insights, Year: 2022, Vol: 2(1), Pages: 100082
BOOK-CHAPTER

Neuro-Symbolic Adaptive Query Processing over Knowledge Graphs

Chang Qin, Maribel Acosta

Lecture Notes in Computer Science, Year: 2025, Pages: 253-270