The emergence of the computer as an essential tool in scientific research and as a core ingredient in commercial systems has led to the generation of massive amounts of data. These data sets are of critical importance for a broad variety of applications, including (but certainly not limited to) astrophysical models, genetic sequencing, geographic information systems, ecological monitoring, weather prediction, telecommunications applications, commercial digital video and audio, digital libraries, government information systems, and biological models for medical applications. Researchers in all of these application areas currently face daunting computational problems in organizing and extracting useful information from these massive data sets.
In an effort to acquaint the mathematical sciences community with the problems and challenges in this area, the Division of Mathematical Sciences and DIMACS have organized a session on massive data sets at the San Diego meeting of the American Mathematical Society in January. This session will describe some of the challenging mathematical, statistical, and algorithmic problems inherent in organizing and using enormous amounts of data. The emphasis will be on basic issues that transcend particular applications. Speakers will explain why existing mathematical, statistical, and algorithmic methods break down on the enormous data sets that scientists and technologists now encounter regularly, and they will attempt to delineate the boundaries at which these breakdowns occur. The session will include a discussion of specific DIMACS and DMS programs that focus on huge data sets.
The session has been organized by Joan Feigenbaum at AT&T Labs. Abstracts of the talks and the schedule of the session will be available on the DIMACS Web page, http://dimacs.rutgers.edu.