Combating confirmation bias: a unified pseudo-labeling framework for entity alignment

Qijie Ding; Jie Yin; Daokun Zhang; Junbin Gao

doi:10.1007/s10618-025-01128-0

JOURNAL ARTICLE

Year: 2025
Journal: Data Mining and Knowledge Discovery
Vol: 39 (5)
Publisher: Springer Science+Business Media

Abstract

Entity alignment (EA) aims at identifying equivalent entity pairs across different knowledge graphs (KGs) that refer to the same real-world identity. It has been a compelling but challenging task that requires the integration of heterogeneous information from different KGs to expand the knowledge coverage and enhance inference abilities. To circumvent the shortage of prior seed alignments provided for training, recent EA models utilize pseudo-labeling strategies to iteratively add unaligned entity pairs predicted with high confidence to the seed alignments for model training. However, the adverse impact of confirmation bias during pseudo-labeling has been largely overlooked, thus hindering entity alignment performance. To systematically combat confirmation bias, we propose a new Unified Pseudo-Labeling framework for Entity Alignment (UPL-EA) that explicitly alleviates pseudo-labeling errors to boost the performance of entity alignment. UPL-EA achieves this goal through two key innovations: (1) Optimal Transport (OT)-based pseudo-labeling uses discrete OT modeling as an effective means to determine entity correspondences and reduce erroneous matches across two KGs. An effective criterion is derived to infer pseudo-labeled alignments that satisfy one-to-one correspondences; (2) Parallel pseudo-label ensembling refines pseudo-labeled alignments by combining predictions over multiple models independently trained in parallel. The ensembled pseudo-labeled alignments are thereafter used to augment seed alignments to reinforce subsequent model training for alignment inference. The effectiveness of UPL-EA in eliminating pseudo-labeling errors is both theoretically supported and experimentally validated. Our extensive results and in-depth analyses demonstrate the superiority of UPL-EA over 15 competitive baselines and its utility as a general pseudo-labeling framework for entity alignment.
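The two components named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and every specific in it is an assumption made for illustration: the function names, the use of Sinkhorn iterations as the OT solver, the cost `1 - similarity`, the 0.5 cut-off on the transport plan, and the strict-majority vote used for ensembling. The 0.5 cut-off is chosen because, when every row and column of the plan sums to 1, at most one entry per row or column can exceed 0.5, so the selected pairs are one-to-one by construction.

```python
from collections import Counter

import numpy as np


def sinkhorn_plan(cost, reg=0.05, n_iters=200):
    """Entropy-regularised OT plan for a square cost matrix.

    Assumption: rows and columns are scaled to sum to 1, so the plan
    approximates a soft permutation between the two entity sets.
    """
    n, m = cost.shape
    if n != m:
        raise ValueError("this sketch assumes equally sized entity sets")
    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    v = np.ones(n)
    for _ in range(n_iters):
        u = 1.0 / (kernel @ v)    # rescale so rows sum to 1
        v = 1.0 / (kernel.T @ u)  # rescale so columns sum to 1
    return u[:, None] * kernel * v[None, :]


def ot_pseudo_labels(similarity, threshold=0.5):
    """Select (i, j) pairs whose transport mass exceeds the threshold.

    Assumption: cost = 1 - similarity. With unit row/column sums, a
    threshold of 0.5 or more admits at most one pair per row and column.
    """
    plan = sinkhorn_plan(1.0 - similarity)
    rows, cols = np.nonzero(plan > threshold)
    return {(int(i), int(j)) for i, j in zip(rows, cols)}


def ensemble_pseudo_labels(pair_sets, min_votes=None):
    """Keep pairs proposed by at least `min_votes` models.

    Assumption: the default is a strict majority. If each model's pairs
    are one-to-one, a strict majority keeps the ensemble one-to-one too,
    because two conflicting pairs cannot both win more than half the votes.
    """
    if min_votes is None:
        min_votes = len(pair_sets) // 2 + 1
    votes = Counter(pair for pairs in pair_sets for pair in pairs)
    return {pair for pair, count in votes.items() if count >= min_votes}
```

In a pseudo-labeling loop, the ensembled pairs would be added to the seed alignments before the next round of training; how the similarity matrices are produced (for example, from learned entity embeddings) is outside the scope of this sketch.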

Keywords:

Inference; Computer science; Economic shortage; Task (project management); Artificial intelligence; Natural language processing; Orientation (vector space); Training set; Machine learning; Data mining; Pattern recognition (psychology); Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 9.64

Citation Normalized Percentile: 0.97

Topics

Advanced Graph Neural Networks

Physical Sciences → Computer Science → Artificial Intelligence

Data Quality and Management

Social Sciences → Decision Sciences → Management Science and Operations Research

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence
