Selected Accomplishments/Achievements: Working Group on Algorithms for Multidimensional Scaling I

Multidimensional Scaling (MDS): MDS is widely used in the social and behavioral sciences. Roughly, its goal is to take a multivariate data set and represent it in a low-dimensional Euclidean space, often in 2 dimensions, with as little distortion of the data as possible. At its first meeting, the working group explored nonlinear and nonmetric versions of MDS, the fitting of various non-Euclidean representations in both the two-way and three-way cases, and the need to develop techniques that can be applied to massive data sets.

The problem of massive data sets is difficult because it will require entirely new techniques: most existing ones are extremely computationally intensive and so severely limit the size of the data arrays that can be handled. One promising approach involves the random deletion of a substantial portion of the data. Preliminary results indicated that as much as 60% of the data could be deleted without a serious effect on the output. Other approaches use heuristics to get close to the solution and then try to refine the output of the heuristic. This is work done by Willem Heiser and his colleagues at Leiden University.

One well-known approach to fitting two-way Euclidean MDS models involves a singular value decomposition (SVD) of a derived matrix of scalar products, and methods already exist for computing the SVD of very large matrices. Building on this, Mark Rorvig (unfortunately recently deceased) and David Dubin, in collaborative work with Douglas Carroll, applied SVD methods for massive data sets to this particular version of MDS in the case of extremely large matrices of proximities, that is, proximity data on a very large number of stimuli or other objects. Various ways of extending these methods to other, more complex MDS models are being explored.
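
For readers unfamiliar with the SVD-based approach to two-way Euclidean MDS, the following is a minimal sketch of the textbook procedure (often called classical or Torgerson scaling), not the large-scale method used by the researchers named above. It assumes the proximities are distances held in a dense NumPy array; the function name `classical_mds` and its parameters are illustrative choices, not taken from the working group's software.

```python
import numpy as np

def classical_mds(dist, k=2):
    """Embed n objects in k dimensions from an n x n matrix of distances.

    Illustrative sketch only: assumes `dist` is symmetric and (close to)
    Euclidean, and that it is small enough to hold in memory.
    """
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double centering converts squared distances into the derived
    # matrix of scalar products: B = -1/2 * J * D^2 * J.
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ d2 @ j
    # For a symmetric positive semidefinite B, the SVD coincides with
    # the eigendecomposition; singular values come back in descending order.
    u, s, _ = np.linalg.svd(b)
    # Coordinates are the leading singular vectors scaled by sqrt(singular value).
    return u[:, :k] * np.sqrt(s[:k])

# Usage: recover a planar configuration (up to rotation/reflection) from its distances.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0], [1.0, 1.0]])
distances = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(distances, k=2)
```

The dense SVD here costs on the order of n^3 operations, which is exactly the limitation that motivates the massive-data methods discussed above. Note also that if the distances are far from Euclidean, the scalar-product matrix can have negative eigenvalues, which this sketch does not handle.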

The main accomplishment of the first meeting of this group was the development and enhancement of cross-disciplinary research efforts. Here are the highlights of these endeavors.

Larry Hubert (Psychology, University of Illinois), Phipps Arabie and Douglas Carroll (Graduate School of Management, Rutgers), together with Michael Brusco (School of Business, Florida State University), are exploring mathematical programming techniques for fitting MDS models, including various possible collaborative efforts.

David Dubin (Library Science, University of Illinois), Douglas Carroll, and Michael Trossett (Math., William and Mary) are exploring approaches to MDS of massive data sets, including, as noted above, possible extensions of established research in this area.

The meeting also started or strengthened collaborations between academic participants and industrial scientists such as Anil Chaturvedi of Kraft Foods and Andreas Buja of AT&T Laboratories.

This material is based upon work supported by the National Science Foundation under Grant No. 0100921.

Document last modified on April 29, 2003.