Liyi Chen, Zhi Li, Tong Xu, Han Wu, Zhefeng Wang, Nicholas Jing Yuan, Enhong Chen
The boom of multi-modal knowledge graphs (MMKGs) has created an imperative demand for multi-modal entity alignment techniques, which facilitate the integration of multiple MMKGs from separate data sources. Unfortunately, prior work harnesses multi-modal knowledge only via the heuristic merging of uni-modal feature embeddings, so inter-modal cues concealed in multi-modal knowledge are largely ignored. To address this problem, in this paper we propose a novel Multi-modal Siamese Network for Entity Alignment (MSNEA) to align entities across different MMKGs, in which multi-modal knowledge is comprehensively leveraged by exploiting inter-modal effects. Specifically, we first devise a multi-modal knowledge embedding module that extracts visual, relational, and attribute features of entities to generate holistic entity representations for distinct MMKGs. During this procedure, we employ inter-modal enhancement mechanisms that integrate visual features to guide relational feature learning and adaptively assign attention weights to capture attributes valuable for alignment. Afterwards, we design a multi-modal contrastive learning module to achieve inter-modal enhancement fusion while avoiding the overwhelming impact of weak modalities. Experimental results on two public datasets demonstrate that our proposed MSNEA achieves state-of-the-art performance, outperforming competitive baselines by a large margin.
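To make the described module structure concrete, below is a minimal, hypothetical PyTorch sketch of what a multi-modal contrastive fusion objective of this kind could look like. The class name, dimensions, learnable modality weights, and the symmetric InfoNCE formulation are illustrative assumptions for exposition, not the authors' released MSNEA implementation.

```python
# Hypothetical sketch: fuse per-modality entity embeddings and train a
# contrastive alignment objective over seed entity pairs from two MMKGs.
# All names and design choices here are assumptions, not the MSNEA code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiModalContrastiveAlignment(nn.Module):
    """Fuses visual/relational/attribute embeddings and scores seed
    alignment pairs with an InfoNCE-style contrastive loss."""

    def __init__(self, dim: int, temperature: float = 0.1):
        super().__init__()
        # Learnable modality weights let informative modalities dominate
        # the fused representation instead of a fixed uniform average,
        # limiting the impact of a weak modality.
        self.modality_logits = nn.Parameter(torch.zeros(3))
        self.temperature = temperature
        self.proj = nn.Linear(dim, dim)

    def fuse(self, vis, rel, attr):
        # vis / rel / attr: [num_entities, dim] embeddings per modality.
        w = torch.softmax(self.modality_logits, dim=0)
        fused = w[0] * vis + w[1] * rel + w[2] * attr
        return F.normalize(self.proj(fused), dim=-1)

    def forward(self, feats_kg1, feats_kg2):
        # feats_kg*: tuples (vis, rel, attr); row i of KG1 is the seed
        # counterpart of row i of KG2.
        z1 = self.fuse(*feats_kg1)
        z2 = self.fuse(*feats_kg2)
        logits = z1 @ z2.t() / self.temperature        # pairwise similarities
        targets = torch.arange(z1.size(0), device=z1.device)
        # Symmetric InfoNCE: each entity must identify its counterpart
        # among all candidates in the other MMKG, and vice versa.
        return 0.5 * (F.cross_entropy(logits, targets)
                      + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    torch.manual_seed(0)
    dim, n = 64, 32
    model = MultiModalContrastiveAlignment(dim)
    kg1 = tuple(torch.randn(n, dim) for _ in range(3))
    kg2 = tuple(torch.randn(n, dim) for _ in range(3))
    print(model(kg1, kg2).item())
```

In this sketch, the softmax-weighted fusion stands in for the paper's inter-modal enhancement, and the in-batch negatives of the contrastive loss stand in for its multi-modal contrastive learning module; the actual mechanisms in MSNEA may differ.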
Wenxin Ni, Qianqian Xu, Yangbangyan Jiang, Zongsheng Cao, Xiaochun Cao, Qingming Huang
Yuqiong You, Yuyang Wei, Yanlong Zhang, Wei Chen, Lei Zhao
Jinxu Li, Qian Zhou, Wei Chen, Lei Zhao
Taoyu Su, Xinghua Zhang, Jiawei Sheng, Zhenyu Zhang, Tingwen Liu
Hao Guo, Jiuyang Tang, Weixin Zeng, Xiang Zhao, Li Liu