Identifying matching attributes across heterogeneous data sources is a critical and time-consuming step in data integration. In this paper, the author proposes a method for matching the most frequently encountered types of attributes across overlapping heterogeneous data sources. The method uses mutual information as a unified measure of dependence between attributes of various types. An example demonstrates the utility of the proposed method, which can support the development of practical attribute matching tools.
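To make the idea of mutual information as a dependence measure concrete, the following is a minimal sketch and not the author's implementation. It assumes that the overlapping records of the two sources have already been aligned row by row and that attribute values are discrete; neither assumption is stated in the abstract. The function name `mutual_information` and the sample columns (`gender_a`, `sex_b`, `dept_b`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two aligned
    sequences of discrete values (one value per shared record)."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    count_x = Counter(xs)            # marginal counts for the first attribute
    count_y = Counter(ys)            # marginal counts for the second attribute
    count_xy = Counter(zip(xs, ys))  # joint counts over aligned records
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical aligned columns: source A codes gender as letters,
# source B codes it as integers and also has an unrelated column.
gender_a = ["M", "F", "F", "M", "F", "M", "M", "F"]
sex_b = [1, 0, 0, 1, 0, 1, 1, 0]
dept_b = ["x", "x", "y", "y", "x", "x", "y", "y"]

mi_match = mutual_information(gender_a, sex_b)      # same attribute, different coding
mi_unrelated = mutual_information(gender_a, dept_b)  # statistically independent here
```

In this toy data the differently coded pair scores higher than the unrelated pair, which illustrates why a dependence measure can flag candidate matches without requiring identical value encodings. Continuous attributes would need discretization or a density-based estimator, which this sketch does not cover.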
Patrick Pantel, Andrew Philpot, Eduard Hovy
Chao Kong, Ming Gao, Xu Chen, Weining Qian, Aoying Zhou
Yang Yang, Yizhou Sun, Jie Tang, Bo Ma, Juanzi Li