## DIMACS TR: 2003-06

## A Data Mining Problem in Stochastic Programming

### Authors: Andras Prekopa and Xiaoling Hou

**ABSTRACT**

In this paper we consider a linear programming problem in which some or all
of the technology coefficients are deterministic but have unknown values.
Samples are taken to estimate these coefficients, and the problem is to
determine the optimal sample sizes. If we replace the unknown coefficients by
their estimates, we obtain a random linear programming problem whose optimum
value is also random. We want to find sample sizes such that the confidence
interval constructed from the samples for the unknown deterministic optimum
value covers it with a prescribed large probability and, subject to this
constraint, the total cost of sampling is minimal.
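
The problem described in the abstract could be written schematically as follows. This is only a sketch: the equality-constrained form of the linear program and all symbols ($S$, $n_{ij}$, $k_{ij}$, $L$, $U$, $p$) are assumptions introduced here for illustration and are not taken from the report, which may use a different formulation.

```latex
% Illustrative notation only; none of these symbols come from the report.
% True problem: some entries a_{ij}, (i,j) \in S, of A are deterministic but unknown.
z^{*} = \min_{x} \left\{ c^{\top} x \;:\; A x = b,\ x \ge 0 \right\}

% Random problem: each unknown a_{ij} is replaced by an estimate \hat{a}_{ij}
% computed from a sample of size n_{ij}, giving the random matrix \hat{A}.
\hat{z} = \min_{x} \left\{ c^{\top} x \;:\; \hat{A} x = b,\ x \ge 0 \right\}

% Sample-size problem: k_{ij} is the (assumed) cost per observation of a_{ij},
% [L, U] is a confidence interval for z^{*} built from the samples,
% and p is the prescribed large probability.
\min_{n_{ij},\ (i,j) \in S} \ \sum_{(i,j) \in S} k_{ij}\, n_{ij}
\quad \text{subject to} \quad
P\left( L \le z^{*} \le U \right) \ge p
```

In this reading, the interval endpoints $L$ and $U$ depend on the samples and therefore on the sample sizes $n_{ij}$, which is what couples the coverage constraint to the sampling cost.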

Paper Available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-06.ps.gz

