DIMACS TR: 2003-06

A Data Mining Problem in Stochastic Programming

Authors: Andras Prekopa and Xiaoling Hou


In this paper we consider a linear programming problem in which some or all of the technology coefficients are deterministic but have unknown values. Samples are taken to estimate these coefficients, and the problem is to determine the optimal sample sizes. If we replace the unknown coefficients by their estimates, we obtain a random linear programming problem whose optimum value is also random. We want to find sample sizes such that the confidence interval constructed from the samples covers the unknown deterministic optimum value with a prescribed high probability and, subject to this constraint, the total cost of sampling is minimized.
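
The effect of sample size on the random optimum value can be illustrated with a small simulation. The sketch below is not the method of the paper: it assumes a hypothetical toy problem (maximize c1*x1 + c2*x2 subject to a1*x1 + a2*x2 <= b, x >= 0), normally distributed measurement noise, and sample means as the coefficient estimates. All names and numbers (C, B, TRUE_A, NOISE_SD, lp_optimum, estimate, optimum_spread) are made up for illustration. The one-constraint form is chosen only because its optimum has a closed form, so no LP solver is needed.

```python
import random
import statistics

# Hypothetical toy data (assumptions, not taken from the report):
# maximize c1*x1 + c2*x2  subject to  a1*x1 + a2*x2 <= b,  x >= 0.
C = (3.0, 2.0)
B = 10.0
TRUE_A = (2.0, 1.0)   # the "unknown" deterministic technology coefficients
NOISE_SD = 0.3        # assumed standard deviation of one measurement


def lp_optimum(a, c=C, b=B):
    """Optimum value of the one-constraint LP.

    With a single constraint and positive coefficients, the whole budget b
    goes to the variable with the largest ratio c_j / a_j.
    """
    if any(aj <= 0 for aj in a):
        raise ValueError("this sketch assumes positive coefficient estimates")
    return b * max(cj / aj for cj, aj in zip(c, a))


def estimate(true_value, n, rng):
    """Sample mean of n noisy measurements of one coefficient."""
    return statistics.fmean(rng.gauss(true_value, NOISE_SD) for _ in range(n))


def optimum_spread(sample_sizes, trials=2000, seed=0):
    """Mean and standard deviation of the estimated optimum over many trials."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        a_hat = [estimate(a, n, rng) for a, n in zip(TRUE_A, sample_sizes)]
        values.append(lp_optimum(a_hat))
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    print("true optimum:", lp_optimum(TRUE_A))
    for n in (25, 100, 400):
        mean, sd = optimum_spread((n, n))
        print(f"n={n}: mean={mean:.3f}, sd={sd:.3f}")
```

Larger samples narrow the distribution of the estimated optimum, and with it any interval built around the estimate, but they cost more. The sample-size problem described above is about balancing these two effects.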

Paper Available at: ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-06.ps.gz