DIMACS TR: 2008-08

Anomaly Detection by Reasoning from Evidence in Mobile Wireless Networks



Authors: Nikita I. Lytkin, William M. Pottenger and Ilya B. Muchnik

ABSTRACT

Anomaly detection is concerned with identifying abnormal patterns in the behavior of a system. Traditional supervised machine learning methods of classification rely on training data in the form of labeled data instances representative of each class (e.g., normal vs. anomalous data). Clustering methods, on the other hand, do not require a priori knowledge of how anomalies are represented in the data space, and are therefore particularly suitable for anomaly detection. Partitional clustering methods such as K-means, however, require the user to specify the number $K$ of clusters. This work proposes three heuristics for determining an appropriate number of clusters in a dataset, each relying on the joint use of two partitional clustering methods. The heuristics were first evaluated on synthetic data and then applied to real-world data from the domain of computer network security. Experimental results demonstrated that clustering methods are adequate for detecting large-scale anomalous events in the Internet. Additional experimental results obtained on several datasets from the UCI machine learning repository indicated the scalability of the heuristics across domains of application.

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2008/2008-08.pdf