Yifei Chen, Feng Liu, Bram Vanschoenwinkel, Bernard Manderick
This paper focuses on the use of support vector machines for a typical context-dependent classification task, splice site prediction. For this type of problem, it has been shown that a context-based approach should be preferred over a transformation approach, because the former can easily incorporate statistical measures or plug sensitivity information directly into distance functions. In this paper, we design three types of context-sensitive kernel functions: polynomial-based, radial basis function-based and negative distance-based kernels. The experimental results show that the radial basis function-based kernel with information gain weighting achieves the best accuracy, and that it always outperforms its simple non-sensitive counterpart in both accuracy and model complexity. With well-designed features and carefully chosen context sizes, our system predicts splice sites with fairly high accuracy, reaching FP95% rates of 3.94 for donor sites and 5.98 for acceptor sites, which is close to the state of the art at the time of writing.
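To make the idea of a radial basis function-based kernel with information gain weighting concrete, here is a minimal Python sketch. It rests on several assumptions that are not in the abstract: the distance is a weighted overlap (mismatch) count over aligned positions, the per-position weights are information gain values estimated from labelled sequences, and the kernel has the form exp(-gamma * distance). The function names, the toy sequences and the `gamma` value are all hypothetical, and the exact formulas used by the authors may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(seqs, labels, pos):
    """Information gain of the symbol at position `pos` w.r.t. the class label."""
    groups = {}
    for seq, label in zip(seqs, labels):
        groups.setdefault(seq[pos], []).append(label)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def weighted_overlap_distance(a, b, weights):
    """Sum the weights of the positions where the two contexts differ (assumed form)."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def rbf_context_kernel(a, b, weights, gamma=1.0):
    """RBF-style kernel on the weighted overlap distance (assumed form)."""
    return math.exp(-gamma * weighted_overlap_distance(a, b, weights))


# Hypothetical toy data: fixed-length contexts with labels (1 = site, 0 = non-site).
seqs = ["AGGTAA", "CGGTGA", "AGCTAA", "CGCAGA"]
labels = [1, 1, 0, 0]

# One weight per context position, estimated from the toy data.
weights = [information_gain(seqs, labels, p) for p in range(len(seqs[0]))]

# Gram matrix over the toy sequences.
gram = [[rbf_context_kernel(a, b, weights) for b in seqs] for a in seqs]
```

In this toy data only position 2 separates the two classes perfectly, so it receives the largest weight, and a mismatch there lowers the kernel value far more than a mismatch at an uninformative position. A Gram matrix built this way could be passed to any SVM implementation that accepts precomputed kernels.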
Yong Zhang, Chao‐Hsien Chu, Yi‐Ping Phoebe Chen, Hongyuan Zha, Xiangling Ji
Sören Sonnenburg, Gabriele Schweikert, Petra Philips, Jonas Behr, Gunnar Rätsch
Tanasanee Phienthrakul, Boonserm Kijsirikul
Nor Azizah Hitam, Amelia Ritahani Ismail, Ruhaidah Samsudin, Eman H. Alkhammash