Applications based on peer-to-peer (P2P) protocols have become tremendously popular over the last few years and now account for a significant share of total network traffic. To circumvent restrictions imposed by network administrators, P2P protocols have become more sophisticated and employ various techniques to avoid detection and recognition by standard measurement tools. This paper presents three P2P traffic metrics and applies semi-supervised clustering to identify P2P applications. The semi-supervised classification method consists of two steps. First, a Particle Swarm Optimization (PSO) clustering algorithm partitions a training dataset that mixes a few labeled samples with abundant unlabeled samples. Then, the available labeled samples are used to map the clusters to application classes. Each sample consists of three P2P traffic metrics: IP Address Discreteness, Success Rate of Connections, and Bi-directional Connections Rate. Experimental results on campus network traffic show that high P2P traffic classification accuracy is achieved with only a few labeled samples.
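The second step, mapping clusters to application classes, can be sketched as follows. This is a minimal illustration only: the abstract does not state the mapping rule, so the majority vote used here is an assumption, and the function names, the label strings, and the "unknown" fallback are hypothetical. The cluster assignments are taken as given (for example, as produced by the PSO clustering step, which is not implemented here).

```python
from collections import Counter, defaultdict


def map_clusters_to_classes(cluster_ids, labels, unknown="unknown"):
    """Map each cluster to a class by majority vote of its labeled samples.

    cluster_ids: cluster index of each training sample.
    labels: class label of each sample, or None if the sample is unlabeled.
    Majority voting is an assumed rule, not taken from the paper.
    """
    votes = defaultdict(Counter)
    for cid, label in zip(cluster_ids, labels):
        if label is not None:
            votes[cid][label] += 1

    mapping = {}
    for cid in set(cluster_ids):
        if votes[cid]:
            # Most frequent label among the labeled samples in this cluster.
            mapping[cid] = votes[cid].most_common(1)[0][0]
        else:
            # No labeled sample fell into this cluster.
            mapping[cid] = unknown
    return mapping


def classify(cluster_ids, mapping):
    """Assign every sample the class of the cluster it belongs to."""
    return [mapping[cid] for cid in cluster_ids]


# Hypothetical toy data: six samples in three clusters, three of them labeled.
cluster_ids = [0, 0, 0, 1, 1, 2]
labels = ["P2P", None, "P2P", "non-P2P", None, None]

mapping = map_clusters_to_classes(cluster_ids, labels)
predicted = classify(cluster_ids, mapping)
```

With this rule, unlabeled samples inherit the class of their cluster, which is how a small number of labeled samples can cover a large unlabeled training set.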
Jeffrey Erman, Anirban Mahanti, Martin Arlitt, Ira L. Cohen, Carey Williamson
Kun Dai, Hongyi Yu, Qing Li, Xia Zhang
Jinhui Ning, Yu Wang, Jie Yang, Haris Gacanin, Song Ci