DIMACS TR: 2008-08
Anomaly Detection by Reasoning from Evidence in Mobile Wireless Networks
Authors: Nikita I. Lytkin, William M. Pottenger and Ilya B. Muchnik
Anomaly detection is concerned with identifying abnormal patterns in the
behavior of a system. Traditional supervised machine learning methods
of classification rely on training data in the form of labeled data
instances representative of each class (e.g. normal vs anomalous data).
Clustering methods, on the other hand, do not require a priori
knowledge of how anomalies are represented in the data space, and are
therefore particularly suitable for anomaly detection.
Partitional clustering methods such as K-means require the number $K$ of
clusters to be specified by a user. This work proposes three heuristics
that use two partitional clustering methods jointly to determine an
appropriate number of clusters in a dataset.
The heuristics were first evaluated on synthetic data and then applied
to real-world data from the domain of computer network security.
Experimental results demonstrated that clustering methods are adequate
for detecting large-scale anomalous events on the Internet.
Scalability of the heuristics across domains of application was
indicated by additional experimental results obtained on several
datasets from the UCI machine learning repository.
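
As a rough illustration of the general idea in the abstract (clustering can
expose anomalies without labeled training data, but K-means needs $K$ supplied
up front), the following minimal sketch runs a plain K-means and flags the
members of unusually small clusters. It is not the report's heuristics, which
the abstract does not spell out. The toy data, the function names `kmeans` and
`flag_small_clusters`, the farthest-point seeding, and the `max_fraction`
threshold are all assumptions made for this example.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd-style K-means with deterministic farthest-point seeding.

    The seeding rule is an assumption chosen to keep the example reproducible.
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

def flag_small_clusters(labels, k, max_fraction=0.25):
    """Return indices of points in clusters holding at most max_fraction of the data."""
    n = len(labels)
    small = {j for j in range(k) if labels.count(j) / n <= max_fraction}
    return [i for i, lab in enumerate(labels) if lab in small]

# Hypothetical toy data: eight "normal" points near the origin, two far away.
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (-0.1, 0.1),
    (0.0, -0.2), (0.3, 0.0), (-0.2, -0.1), (0.1, -0.1),
    (10.0, 10.0), (10.2, 9.9),
]
K = 2  # chosen by hand here; picking K is the problem the report addresses
labels, centroids = kmeans(points, K)
anomalies = flag_small_clusters(labels, K)
print(anomalies)  # [8, 9]
```

No labels are used at any point: the two distant points are flagged only
because they form a small cluster of their own. The result depends on the
hand-picked `K`, which is the limitation that motivates heuristics for
choosing the number of clusters.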