Network Representation Learning (NRL) learns a low-dimensional representation of a graph that can then be used directly in machine learning tasks such as classification, recommendation, and prediction. Unlike homogeneous networks, heterogeneous information networks (HINs) contain multiple types of nodes and edges and therefore carry richer semantic and structural information. Because of this heterogeneity, conventional representation learning methods are not directly applicable. In this paper, we propose a semi-supervised HIN embedding model adopted from the natural language processing community. The model uses sequences of nodes obtained by random walks constrained on edge types, so that structural and semantic properties are preserved. These sequences play the role of sentences in a document, and each sequence is labeled based on the nodes it contains. We adopt a 1D convolutional neural network (CNN) sentence classification model that fits a sequence classifier while optimizing the node representations. Experiments on vertex classification on two widely used real-world datasets show performance better than or comparable to the state of the art.
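As a rough illustration of the sequence-generation step described above, the following sketch builds edge-type-constrained random walks on a toy HIN and assigns each walk a label. Everything in it is an assumption made for illustration: the toy graph, the node and edge-type names, the cyclic edge-type pattern, and in particular the majority-vote labeling rule, since the text only states that a sequence is labeled based on the nodes it contains. The 1D CNN classifier itself is not shown.

```python
import random
from collections import Counter

# Hypothetical toy HIN: (node, edge_type) -> list of neighbours.
# Authors (a*), papers (p*), and one venue (v1); not taken from the paper.
EDGES = {
    ("a1", "writes"): ["p1", "p2"],
    ("a2", "writes"): ["p2"],
    ("p1", "written_by"): ["a1"],
    ("p2", "written_by"): ["a1", "a2"],
    ("p1", "published_in"): ["v1"],
    ("p2", "published_in"): ["v1"],
    ("v1", "publishes"): ["p1", "p2"],
}

# Only some nodes carry labels (the semi-supervised setting).
LABELS = {"a1": "db", "a2": "ml"}


def constrained_walk(start, edge_pattern, length, rng):
    """Random walk from `start` that may only follow the edge types
    listed in `edge_pattern`, applied cyclically step by step."""
    walk = [start]
    for step in range(length - 1):
        edge_type = edge_pattern[step % len(edge_pattern)]
        neighbours = EDGES.get((walk[-1], edge_type), [])
        if not neighbours:  # dead end: no edge of the required type
            break
        walk.append(rng.choice(neighbours))
    return walk


def label_walk(walk):
    """Assumed labeling rule: majority label among the labeled nodes
    in the walk, or None if the walk contains no labeled node."""
    counts = Counter(LABELS[n] for n in walk if n in LABELS)
    return counts.most_common(1)[0][0] if counts else None


rng = random.Random(0)
pattern = ["writes", "written_by"]  # author -> paper -> author -> ...
walks = [constrained_walk("a1", pattern, 5, rng) for _ in range(3)]
labeled = [(walk, label_walk(walk)) for walk in walks]

for walk, label in labeled:
    print(" ".join(walk), "->", label)
```

Each printed line is one "sentence" of node identifiers together with its label; in a pipeline of the kind the abstract describes, such pairs would be the training input to the sequence classifier whose embedding layer holds the node representations.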
Yuya Ogawa, Seiji Maekawa, Yuya Sasaki, Yasuhiro Fujiwara, Makoto Onizuka
Haodong Zou, Zhen Duan, Xinru Guo, Shu Zhao, Jie Chen, Yanping Zhang, Jie Tang
Maoguo Gong, Chuanyu Yao, Yu Xie, Mingliang Xu
Chaozhuo Li, Zhoujun Li, Senzhang Wang, Yang Yang, Xiaoming Zhang, Jianshe Zhou