Finding near neighbors according to various distance measures arises in many application areas, including but not limited to finding similar web documents, images, DNA sequences, lines of text, or audio and video clips, and more generally in areas of statistics, data mining, information retrieval, pattern recognition and data compression. The goal of the Implementation Challenge is to determine how algorithms depend on the structure of the underlying data sets, as well as to determine realistic algorithm performance where worst case analysis is overly pessimistic and probabilistic models are too unrealistic. We invite research projects in the following two directions.
Providing Interesting Data Sets and Distance Measures. We invite researchers from various application areas to provide interesting data sets for near neighbor problems. Contributions could consist either of sample data sets from a true application or of realistic instance generators resembling practical data sets. Our goal is to construct a library of test problems that will be available for study both during and after the Challenge.
Developing Near Neighbor Algorithms. Neighbor algorithms may be developed either for specific application domains or for more general spaces. Algorithms may aim to find the nearest neighbor, several nearest neighbors, or approximate neighbors. Projects may involve either public domain or proprietary codes.
Contents:
Previous
DIMACS Implementation Challenges.