The Sixth DIMACS Implementation Challenge:
Available Software


Sample Templates

(as described in Developers Guide)
Search Algorithm ( gzip'd tarfile)
algorithm.h
algorithm.c (<-- this will be replaced by you)
client.h
client.c
messages.h
oracle.h
search.c (<-- you may want to modify this too)
Makefile
Data Set Oracle ( gzip'd tarfile)
euclid.c (<-- this will be replaced by you)
messages.h
oracle.h
server.h
server.c
Makefile

Available Search Algorithms

Brute Force
Contributor: Michael Goldwasser, wass@cs.princeton.edu
Language: C
Available as: [ gzip'd tarfile (6K) ]
Description: This algorithm is a brilliant one, using almost no space, no preprocessing, and using only linear time to answer queries.

SR-Tree
Contributor: Norio Katayama, katayama@rd.nacsis.ac.jp
Version:1.0 (August 31 1998)
Language: C or C++
Available as: [ gzip'd tarfile (184K) ]
Description: Click here for information about these participants and the algorithm.

ANN (includes kd-tree implementations)
Contributor: David Mount, mount@cs.umd.edu
Language: C++
ANN Home Page: http://www.cs.umd.edu/~mount/ANN/
Description: ANN is a library written in the C++ programming language to support both exact and approximate nearest neighbor searching in Euclidean spaces of various dimensions. In particular, ANN implements kd-trees with a choice of several possible splitting rules, as well as other box-decomposition trees.

To download ANN along with documentation on using the library, please see the ANN home page.

In addition, if you would like a wrapper around ANN which makes in conform to the "official" Implementation Challenge development standards, you may download the additional package: [ gzip'd tarfile (6K) ]


Available Data Sets and Distance Metrics

Euclid
Contributor: Michael Goldwasser, wass@cs.princeton.edu
Version:1.0 (August 24 1998)
Language: C
Available as: [ gzip'd tarfile (6K) ]
Description: This program provides an example of an oracle for a Euclidean metric space. The points can be either read from a file or generated uniformly at random from the unit hypercube. Additionally, points can be color-coded for classification (optionally). The distance metric can be set to the L_p norm for any positive integer p, or to the L_infinity norm. (This oracle combines and replaces the previous "randcube" and "color_euclid" oracles)

Text - Edit Distance
Contributor: Michael Goldwasser, wass@cs.princeton.edu (Based on original code provided by Sergey Brin)
Version:1.0 (January 16 1998)
Language: C
Available as: [ gzip'd tarfile (6K) ]
Description: This program provides an example of two distance metrics creating a space over lines of text provided at runtime. The space can be created using one of two edit distance functions as the metric.

Gauss
Contributor: Alfons Juan, ajuan@iti.upv.es
Version:1.0 (May 1998)
Language: C
Available as: [ gzip'd tarfile (13K) ]
Description: gaussora (GAUSSian ORAcle) is an oracle for Euclidean samples generated from a mixture of Gaussian distributions. Additionally, generated samples are color-coded based on the individual distribution responsible for each sample.

Hidden Markov Model
Contributor: Alfons Juan, ajuan@iti.upv.es
Version:1.2 (July 9 1998)
Language: C
Available as: [ gzip'd tarfile (204K) ]
Description: hmmora (Hidden Markov Model ORAcle) is an oracle for string samples generated from a mixture of Hidden Markov Models. Additionally, generated samples are color-coded based on the individual distribution responsible for each sample.

Image Vectors
Contributor: Norio Katayama, katayama@rd.nacsis.ac.jp
Available as: [ gzip'd ascii (3.2M) ]
Description: A file of 40700 Euclidean vectors in 20-dimensional space, generated as feature vectors from images downloaded from NASA. As an experiment, 37000 of these vectors are randomly selected to be in the original data set, with the remaining 3700 used as query points. Click here for more information about the generation of the vectors. The file is formatted for use as input to the Euclid oracle.


Application Specific Protocols

Euclidean: The associated data for a point in a d-dimensional Euclidean metric space should consists of exactly d fields, one for each dimension, numbered from [0..d-1]. The string returned by InqField(ora,p,i) should be a floating point representation for the ith coordinate of point p (currently with maximum precision of 50 characters).
Example Data Sets: Euclid, Gaussian, Image Vectors
Example Algorithms: SR-Tree, ANN (kd-trees)

Color-coded: Our protocol for color-coded sets will be to have the first field equal to the color, specified as an identifying integer. Notice, that this protocol can be placed on top of other protocols for color-coded versions of the protocols by using the first field to be the color, and offsetting all other fields by one.
Example Data Sets: Euclid, Gaussian, Hidden Markov Model
Example Algorithms:
Index Sixth DIMACS Implementation Challenge.

Index Previous DIMACS Implementation Challenges.

Index DIMACS Home Page.


Maintained by Michael Goldwasser
challenge6@dimacs.rutgers.edu
Document last modified on May 26, 1999.