## DIMACS TR: 98-15

## Bellman Functions on Trees for Segmentation, Generalized
Smoothing, Matching Multi-Alignment in Massive Data Sets

### Authors: Ilya Muchnik and Vadim Mottl

**
ABSTRACT
**

A massive data set is considered as a set of experimentally
acquired values of a number of variables each of which is associated with the
respective node of an undirected adjacency graph that presets the fixed structure
of the data set. The class of data analysis problems under consideration is
outlined by the assumption that the ultimate aim of processing can be represented
as a transformation of the original data array into a secondary array of the same
structure but with node variables of, generally speaking, different nature, i.e.
different ranges. Such a generalized problem is set as the formal problem of
optimization (minimization or maximization) of a real-valued objective function of
all the node variables. The objective function is assumed to consist of additive
constituents of one or two arguments, respectively, node and edge functions. The
former of them carry the data-dependent information on the sought-for values of
the secondary variables, whereas the latter ones are meant to express the a priori
model constraints. For the case when the graph of the pair-wise adjacency of the
data set elements has the form of a tree, an effective global optimization
procedure is proposed which is based on a recurrent decomposition of the initial
optimization problem over all the node variables into a succession of partial
problems each of which consists in optimization of an intervening function of only one variable
like Bellman functions in the classical dynamic programming. Two kinds of
numerical realization of the basic optimization procedure are considered on the
basis of parametric representation of the Bellman functions, respectively, for
discretely defined and quadratic node and edge functions. The proposed theoretical
approach to the analysis of massive data sets is illustrated with its
applications to the problems of segmentation, smoothing, fine texture analysis and matching of visual images and geophysical explorative data, as well as to the problem of multi-alignment of long molecular sequences.

Paper Available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1998/98-15.ps.gz

DIMACS Home Page