DIMACS TR: 2006-08

Hyper-Rectangular and k-Nearest-Neighbor Models in Stochastic Discrimination

Authors: Iryna Skrypnyk and Tin Kam Ho


Stochastic discrimination (SD) theory treats learning as building models that cover the data distribution uniformly. Although the SD method derived from it has been tried successfully in several application domains, a number of difficulties in its practical implementation remain. This paper analyzes simple examples as a first step towards presenting these implementation issues, such as model generation and the preliminary estimates needed to set parameters. Two implementations that differ in how they generate models are discussed: one uses the nearest-neighbor approach to maintain the projectability condition; the other constructs hyper-rectangular regions by randomly selecting subintervals in each dimension. Analysis of these implementations shows that, for high-dimensional data, parallel model generation with the nearest-neighbor approach is a favorable alternative to interval model generation with random manipulation of the feature subspaces.
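
To make the two kinds of model generation concrete, here is a minimal Python sketch. It is an illustration only, not the implementation from the report. It assumes that a hyper-rectangular model is an axis-aligned box with one uniformly drawn subinterval per dimension, and that a nearest-neighbor model is a randomly chosen seed point together with its nearest training points under Euclidean distance. All function names are hypothetical. The sketch does not check the SD conditions themselves (such as projectability).

```python
import random


def random_hyper_rectangle(bounds, rng):
    """Draw one subinterval [lo, hi] inside each dimension's (low, high) range.

    Assumption: both endpoints are drawn uniformly; the report may use a
    different sampling scheme.
    """
    rect = []
    for low, high in bounds:
        a = rng.uniform(low, high)
        b = rng.uniform(low, high)
        rect.append((min(a, b), max(a, b)))
    return rect


def in_rectangle(point, rect):
    """True if the point lies inside the box in every dimension."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(point, rect))


def coverage(points, rect):
    """Fraction of the given points that fall inside the box."""
    return sum(in_rectangle(p, rect) for p in points) / len(points)


def knn_model(points, k, rng):
    """Indices of the k points closest to a random seed point.

    The seed is itself one of the points, so it is normally among the k
    (exact duplicates of the seed can tie with it at distance zero).
    Assumption: squared Euclidean distance is used for ranking.
    """
    seed = rng.choice(points)
    ranked = sorted(
        range(len(points)),
        key=lambda i: sum((p - s) ** 2 for p, s in zip(points[i], seed)),
    )
    return set(ranked[:k])
```

In this sketch the box model is defined over the feature space and can be applied to any point, while the nearest-neighbor model is defined directly as a subset of the training points. Both representations are choices made for illustration.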

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2006/2006-08.pdf