Ziheng GaoNaiyang GuanXuhui HuangXuefeng PengZhigang LuoYuhua Tang
Graph regularized non-negative matrix factorization (GNMF) decomposes a high-dimensional non-negative data matrix into two low-dimensional matrices with the non-negativity property kept and the geometric structure preserved. Due to its effectiveness, GNMF has been widely used in many fields such as computer vision and data mining. However, GNMF cannot process large-scale datasets on distributed system because the gradient of the graph regularization term costs huge amount of communication overheads among computing nodes. In this paper, we proposed a distributed GNMF (DGNMF) algorithm to overcome this deficiency. Particularly, DGNMF reformulates the graph regularization term to avoid multiplying graph Laplacian by factor matrix through introducing an auxiliary variable and incorporating an equality constraint over it. We optimize DGNMF by using greedy coordinate descent method in the frame of augmented Lagrange method and implement this algorithm on a distributed system. Since DGNMF requires quite few communication overheads among computing nodes, it can be applied to large scale dataset. The preliminary results illustrate efficiency, scalability, and effectiveness of DGNMF.
Yingming LiMing YangZhongfei Zhang
Binghui WangMeng PangChuang LinXin Fan
You LuRui LiuHe ZhangZheng Shan
Ziheng GaoNaiyang GuanLongfei Su