Multi-omics studies are believed to provide a more comprehensive picture of a complex biological system than traditional studies based on a single omics data source. From a statistical point of view, however, data integration poses non-trivial challenges. In this review, we highlight recent statistical inference and learning techniques that have been devised in this context. In the first part of the article, we focus on techniques for identifying a relevant biological sub-system from combined omics data. In the second part, we ask how integrated omics data could be used for better personalized patient treatment, in both supervised and unsupervised learning settings. Different classes of algorithms are discussed for both application tasks, and existing and future challenges for data integration methods are pointed out.
Siva Prasad Akula, Raghava Naidu Miriyala, Hanuman Thota, Allam Appa Rao, Srinubabu Gedela
Leann Lac, Carson K. Leung, Pingzhao Hu
George C. Tseng, Debashis Ghosh, Xianghong Jasmine Zhou