Abstract
With the growth of the Internet and Internet-based applications, business premises have spread throughout the world. Because of extreme competition among businesses, one firm may try to undermine another, so secure product design techniques should be adopted. To protect applications from intruders, an intrusion detection system has become an essential requirement for every organization. Intrusion detection models require an enormous quantity of training data and, as a result, sophisticated algorithms and substantial computational resources. In an intrusion detection system, clustering algorithms are used to separate normal activities from abnormal ones, and selecting an efficient clustering algorithm is a challenging task. In this paper, K-Means and C-Means clustering are compared on intrusion datasets. The simulation covers all proximity measures of the K-Means and C-Means clustering techniques. The accuracy of the two algorithms is compared using the confusion matrix. The results show that K-Means provides better clustering accuracy than C-Means. Therefore, K-Means is the better option for designing an intelligent intrusion detection product.
I. INTRODUCTION
At present it is essential to design intelligent software products that can withstand zero-day attacks. Innovative product development is essential to every software firm, and firms should focus on how a product survives in an insecure medium such as the Internet. Interdisciplinary concepts are required to tolerate unusual activities. The term intrusion covers a set of attempts to compromise the confidentiality, integrity and availability of information resources. Intrusion detection is the process of monitoring the events in a system and analyzing the network packets travelling to or from the network. An intrusion detection system automates this process and counteracts intrusive efforts. These efforts can be caused by insiders or outsiders, and intruders can be classified as clandestine users, misfeasors and masqueraders [1].
V. CONCLUSION
Two clustering techniques have been reviewed in this paper using intrusion datasets. The techniques were implemented with different similarity measures, evaluated, and compared on these datasets. The comparative study is concerned with the accuracy of each algorithm, with care taken over the accuracy of the calculations and other performance-related measures. It is found that the K-Means clustering algorithm provides better accuracy and consumes less time than C-Means clustering on these datasets.
The clustering techniques discussed here should not be used alone to predict different attacks. Because the initial centroids are chosen randomly, the resulting class distribution may change from one execution to the next. They should therefore be used in conjunction with other data mining algorithms for better accuracy.
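The dependence on random initial centroids can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper: the six two-feature records, the labels "normal" and "attack", and the function names kmeans, confusion_matrix and clustering_accuracy are all hypothetical, and scoring a clustering by mapping each cluster to its majority class is an assumed convention for reading accuracy off a confusion matrix. Only K-Means with Euclidean distance is shown.

```python
import random

def kmeans(points, k, seed, iters=50):
    """Plain K-Means (Euclidean distance) with randomly sampled initial centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels

def confusion_matrix(true_labels, cluster_labels, classes, k):
    """Rows are clusters, columns are true classes."""
    matrix = [[0] * len(classes) for _ in range(k)]
    for t, c in zip(true_labels, cluster_labels):
        matrix[c][classes.index(t)] += 1
    return matrix

def clustering_accuracy(matrix):
    """Assumed scoring rule: map each cluster to its majority class."""
    total = sum(sum(row) for row in matrix)
    return sum(max(row) for row in matrix) / total

# Hypothetical toy records, not drawn from any intrusion dataset.
data = [((0.1, 0.2), "normal"), ((0.2, 0.1), "normal"), ((0.15, 0.25), "normal"),
        ((0.9, 0.8), "attack"), ((0.85, 0.95), "attack"), ((0.8, 0.9), "attack")]
points = [p for p, _ in data]
truth = [t for _, t in data]
classes = ["normal", "attack"]

# Each seed selects different initial centroids, so the cluster numbering
# and possibly the partition itself can differ between runs.
for seed in range(5):
    labels = kmeans(points, k=2, seed=seed)
    cm = confusion_matrix(truth, labels, classes, k=2)
    print(seed, labels, cm, clustering_accuracy(cm))
```

Comparing the printed confusion matrices across seeds shows how much a single run can be trusted. Repeating the clustering with several seeds and keeping the most consistent result is one simple way to reduce this sensitivity before combining the output with other data mining algorithms.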