Intrusion detection systems (IDS) have become an indispensable tool for protecting networks by detecting attacks. Because each attack is described by a large number of attributes, it is challenging to select a good subset of attributes that improves classification accuracy. This work proposes an effective unsupervised feature selection technique using a genetic algorithm (GA) for analyzing IDS data. The search capability of the GA is used to optimize several unsupervised feature quality measures, including Pearson correlation, mutual information, and entropy. Different combinations of these measures serve as fitness functions in the proposed GA-based framework. The algorithm finds a subset of features that are uncorrelated and mutually exclusive with respect to one another. Finally, the optimal feature subset is used to build classification systems with popular machine learning models, such as decision trees, support vector machines, and the k-nearest neighbor classifier, on the KDD-Cup 99 dataset. The experimental results show that the decision tree produces better results than the other classifiers. The results confirm 99.62% accuracy, a 98.78% detection rate, and a 0.25% false alarm rate. The most attractive property of the proposed scheme is that it does not require any labeled information during the feature selection process.
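As a rough illustration of the idea described above, the following minimal sketch runs a GA over binary feature masks and scores each mask without using any labels. It is not the authors' implementation: the fitness here uses only the mean absolute Pearson correlation among selected features (the mutual information and entropy terms are omitted), and the function names, the `min_features` constraint, and the GA operators (tournament selection, uniform crossover, bit-flip mutation, elitism) are all assumptions made for the example.

```python
import numpy as np


def redundancy(X, mask):
    """Mean absolute pairwise Pearson correlation among the selected columns.

    Assumes every column of X is non-constant; a constant column would give NaN.
    """
    idx = np.flatnonzero(mask)
    if idx.size < 2:
        return 0.0
    corr = np.corrcoef(X[:, idx], rowvar=False)
    upper = np.abs(corr[np.triu_indices(idx.size, k=1)])
    return float(upper.mean())


def fitness(X, mask, min_features=2):
    """Label-free fitness: higher is better, so redundancy is negated.

    Masks with fewer than min_features selected features are rejected
    (hypothetical constraint, added so the search cannot collapse to one feature).
    """
    if mask.sum() < min_features:
        return -np.inf
    return -redundancy(X, mask)


def select_features(X, pop_size=30, generations=40, mutation_rate=0.05,
                    min_features=3, seed=0):
    """Search for a low-redundancy feature mask with a simple GA."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5  # random initial bit masks

    # Start from the all-features mask so a valid best solution always exists.
    best_mask = np.ones(n, dtype=bool)
    best_fit = fitness(X, best_mask, min_features)

    for _ in range(generations):
        fits = np.array([fitness(X, m, min_features) for m in pop])
        i = int(np.argmax(fits))
        if fits[i] > best_fit:
            best_fit, best_mask = float(fits[i]), pop[i].copy()

        # Binary tournament selection.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fits[a] >= fits[b], a, b)]

        # Uniform crossover with randomly paired partners.
        partners = parents[rng.permutation(pop_size)]
        take_first = rng.random((pop_size, n)) < 0.5
        children = np.where(take_first, parents, partners)

        # Bit-flip mutation, then elitism: keep the best mask seen so far.
        flips = rng.random((pop_size, n)) < mutation_rate
        pop = children ^ flips
        pop[0] = best_mask

    return best_mask, best_fit
```

Calling `select_features(X)` on a numeric feature matrix `X` (rows are connection records, columns are attributes) returns a boolean mask and its fitness; the mask can then be used to slice `X` before training a downstream classifier. A fuller fitness function in the spirit of the abstract would combine this correlation term with mutual information and entropy terms.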
Mohammed A. Ambusaidi, Xiangjian He, Priyadarsi Nanda
Ammar Alazab, Michael Hobbs, Jemal Abawajy, Moutaz Alazab
V. Anbumani, Siva Ranjith, L R Navaneethan, R Santhosh
Wentao Fan, Nizar Bouguila, Djemel Ziou
Deeman Yousif Mahmood, Mohammed Abdullah Hussein