Xiang Zhang, Naiyang Guan, Zhilong Jia, Xiaogang Qiu, Zhigang Luo
Advances in DNA microarray technologies have made gene expression profiles a promising basis for identifying different types of cancer. Traditional learning-based cancer identification methods train a classifier on labeled samples, but they are inconvenient in practice because labeled samples are expensive to obtain in clinical cancer research. This paper proposes a semi-supervised projective non-negative matrix factorization method (Semi-PNMF) that learns an effective classifier from both labeled and unlabeled samples, thus boosting subsequent cancer classification performance. In particular, Semi-PNMF jointly learns a non-negative subspace from the concatenated labeled and unlabeled samples and indicates the class of each sample by the position of the maximum entry of its coefficient vector. Because Semi-PNMF incorporates statistical information from the large volume of unlabeled samples into the learned subspace, it can learn a more representative subspace and boost classification performance. We developed a multiplicative update rule (MUR) to optimize Semi-PNMF and proved its convergence. Experimental results on two multiclass cancer gene expression profile datasets show that Semi-PNMF outperforms representative methods.
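The idea of learning one non-negative basis from labeled and unlabeled samples together, and then reading the class off the largest coefficient, can be illustrated with a minimal sketch. This is not the Semi-PNMF objective or the MUR derived in the paper, neither of which is stated in the abstract. It swaps in standard Lee-Seung multiplicative updates for plain NMF, with the coefficients of labeled samples clamped to one-hot class indicators. The function name `semi_nmf_sketch`, the convention of `-1` for unlabeled samples, the initialisation from labeled class means, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def semi_nmf_sketch(X, labels, n_classes, n_iter=200, eps=1e-9):
    """Illustrative label-clamped NMF (NOT the paper's Semi-PNMF update).

    X      : (n_features, n_samples) non-negative data matrix.
    labels : int array of length n_samples, -1 marks an unlabeled sample.
             Every class needs at least one labeled sample.
    """
    labeled = labels >= 0
    # One coefficient row per class; labeled columns are fixed one-hot vectors.
    H = np.full((n_classes, X.shape[1]), 1.0 / n_classes)
    H[:, labeled] = np.eye(n_classes)[:, labels[labeled]]
    # Assumption of this sketch: start the basis at the labeled class means.
    W = np.stack([X[:, labels == c].mean(axis=1) for c in range(n_classes)], axis=1) + eps

    losses = []
    for _ in range(n_iter):
        losses.append(np.linalg.norm(X - W @ H) ** 2)
        # Basis update uses ALL samples, labeled and unlabeled.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # Coefficient update touches only the unlabeled columns.
        Hu = H[:, ~labeled]
        H[:, ~labeled] = Hu * (W.T @ X[:, ~labeled]) / (W.T @ W @ Hu + eps)
    losses.append(np.linalg.norm(X - W @ H) ** 2)

    # Class of each sample = position of the maximum coefficient entry.
    pred = H.argmax(axis=0)
    return W, H, pred, losses

# Synthetic stand-in for expression profiles: 3 classes, 30 features, 60 samples.
rng = np.random.default_rng(0)
k, d, per_class = 3, 30, 20
protos = rng.uniform(0.0, 0.1, size=(d, k))
for c in range(k):
    protos[c * 10:(c + 1) * 10, c] += 1.0          # class-specific feature block
y_true = np.repeat(np.arange(k), per_class)
X = protos[:, y_true] + rng.uniform(0.0, 0.05, size=(d, k * per_class))

labels = np.full(k * per_class, -1)
for c in range(k):
    labels[np.where(y_true == c)[0][:5]] = c        # only 5 labels per class

W, H, pred, losses = semi_nmf_sketch(X, labels, k)
accuracy = (pred == y_true).mean()
print(f"accuracy on all samples: {accuracy:.2f}")
```

Clamping the labeled coefficients ties each basis column to one class, so the unlabeled samples refine the basis without changing what each column means. A projective variant would instead express the coefficients through a projection of the data, which this sketch does not attempt.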
Xiang Zhang, Naiyang Guan, Zhigang Luo, Xuejun Yang
Pengyu Li, Christine Tseng, Yaxuan Zheng, Joyce A. Chew, Longxiu Huang, Benjamin Jarman, Deanna Needell
Qingyao Wu, Mingkui Tan, Xutao Li, Huaqing Min, Ning Sun
Yanhua Chen, Manjeet Rege, Ming Dong, Jing Hua