Traditional multi-view methods often employ neural networks to extract features for clustering. However, the resulting features are typically coarse-grained descriptions of the multi-view data, and their discriminative power is limited. Fine-grained information provides more comprehensive and detailed descriptions of the data, so exploiting fine-grained features of multi-view data is of great significance for multi-view clustering. In this paper, we propose a self-attention-enhanced fine-grained information fusion method for multi-view clustering. Specifically, a linear layer maps the raw multi-view data from their different dimensions to a common dimension. A fine-grained information extraction layer, consisting of two convolution layers, then extracts fine-grained information so that detailed information is represented sufficiently. A self-attention learning module determines which information the model should focus on and fuses the important information into a new feature representation. We adopt deep divergence-based clustering to maintain compactness within clusters and separation between clusters, and leverage contrastive learning to obtain consistent clustering results across views. Experiments on multiple datasets demonstrate the effectiveness of the proposed self-attention-enhanced fine-grained information fusion method for multi-view clustering.
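The pipeline described above can be illustrated with a minimal NumPy sketch. All concrete choices here (layer sizes, kernel lengths, mean-pooling the attended tokens, and an InfoNCE-style loss standing in for the cross-view contrastive objective) are illustrative assumptions, not the paper's exact architecture; the deep divergence-based clustering head is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dims, D = 8, [20, 30], 16          # samples, per-view raw dims, common dim

def relu(z):
    return np.maximum(z, 0.0)

def conv1d(x, k):
    """Valid 1-D convolution of each row of x with kernel k, followed by ReLU."""
    klen = len(k)
    cols = x.shape[1] - klen + 1
    return relu(np.stack([x[:, i:i + klen] @ k for i in range(cols)], axis=1))

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

views = [rng.normal(size=(n, d)) for d in dims]   # toy two-view data

# 1) View-specific linear layers map raw data to the common dimension D.
# 2) Two convolution layers per view extract fine-grained information.
k1, k2 = rng.normal(size=3), rng.normal(size=3)   # toy conv kernels (assumed length 3)
tokens = []
for x, d in zip(views, dims):
    W, b = rng.normal(size=(d, D)) * 0.1, np.zeros(D)
    h = conv1d(conv1d(x @ W + b, k1), k2)
    tokens.append(h)

# 3) Stack per-view features as tokens and fuse them with scaled
#    dot-product self-attention into a new feature representation.
T = np.stack(tokens, axis=1)                          # (n, V, D'), V = number of views
scores = T @ T.transpose(0, 2, 1) / np.sqrt(T.shape[-1])
attn = softmax(scores, axis=-1)                       # (n, V, V) attention weights
fused = (attn @ T).mean(axis=1)                       # (n, D') fused representation

# 4) An InfoNCE-style loss (one assumed form of the contrastive objective)
#    pulls the two views of the same sample toward consistent representations.
def info_nce(a, b, tau=0.5):
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    probs = softmax(a @ b.T / tau, axis=1)
    return -np.mean(np.log(probs[np.arange(len(a)), np.arange(len(a))]))

loss = info_nce(tokens[0], tokens[1])
```

The fused representation would then feed the deep divergence-based clustering module, with the contrastive term added to the training objective to align cluster assignments across views.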
Xiao Yu, Hui Liu, Yan Wu, Caiming Zhang