DIMACS TR: 93-67

Approximation Clustering for Homogeneous Data Tables

Author: Boris Mirkin


Clustering has become an important area of application for combinatorial optimization techniques. In data analysis, clustering is a method aimed at aggregating data so that the data can be better understood. Such an aim can be expressed through various criteria. This requires analyzing the properties of the criteria and algorithms to evaluate the extent to which they correspond to the primary goal; such an analysis becomes as important a problem as the traditional one of developing the algorithms themselves. The author describes some results on clustering considered as an approximation of data (presented as a similarity matrix or contingency table) by discrete structures. Four kinds of clustering structures are of particular interest in the report: partitions, sets, stars, and boxes.

Paper only.