DIMACS TR: 93-67
Approximation Clustering for Homogeneous Data Tables
Author: Boris Mirkin
Clustering has become an important area of application for combinatorial
optimization techniques. In data analysis, clustering is a method aimed
at aggregating data so that they can be better understood.
This aim can be expressed through various criteria. It is therefore
necessary to analyze the properties of the criteria and algorithms in order
to evaluate the extent to which they correspond to the primary goal;
such analysis becomes as important a problem as the traditional problem of
developing the algorithms themselves. The author describes some of the
results on clustering considered as an approximation of data
(presented as a similarity matrix or contingency table) by discrete structures.
Four kinds of clustering structures are
of particular interest in the report: partitions, sets, stars, and boxes.