Interactions between proteins and small-molecule chemicals modulate many protein functions and biological processes, and identifying these interactions is a crucial step in modern drug discovery. Supervised learning methods for predicting protein-chemical interactions (PCI) have been widely studied, but their performance is largely limited by the scarcity of binding data for many proteins. In addition, many complex diseases, such as Alzheimer's disease and cancers, are associated with multiple target proteins. Chemicals that selectively modulate only one of these target proteins cannot treat such diseases effectively. In this paper we propose two multi-task learning (MTL) algorithms for predicting the active compounds of multiple proteins related to the same disease, some of which may have very few binding examples. The first method optimizes the likelihood of compound features under a Gaussian prior, while the second method boosts compound features using a number of independent boosting classifiers. Experimental studies demonstrate significant performance improvements of our MTL methods over baseline methods. Our MTL methods are also able to accurately identify promiscuous compounds that interact with multiple related proteins.
Anveshi Charuvaka, Huzefa Rangwala
Olivier Chapelle, Pannagadatta K. Shivaswamy, Srinivas Vadrevu, Kilian Q. Weinberger, Ya Zhang, Belle L. Tseng
Yuren Mao, Zekai Wang, Weiwei Liu, Xuemin Lin, Wenbin Hu
Theodoros Evgeniou, Massimiliano Pontil
Yuyou Weng, Chen Lin, Xiangxiang Zeng, Yun Liang