DIMACS Mini Workshop: Exploring Large Data Sets Using Classification, Consensus, and Pattern Recognition Techniques

Multiple Alignment of Biomolecular Sequences and Voting Paradoxes

Sandia National Laboratories (Massively Parallel Computing Research Laboratory MS 1110 Albuquerque, NM 87185)

R. Ravi (Carnegie Mellon)

 

  The notion of multiple sequence alignment appeared for the first time
in a 1963 paper by L. Pauling and E. Zuckerkandl entitled: "Chemical
Paleogenetics: Molecular Restoration Studies of Extinct Forms of
Life."  Multiple sequence alignment is today one of the most vibrant
and active research areas of computational molecular biology. Although
it plays a founding role in the development of the field, and a vital
role in the recent breakthrough discoveries, the research in this area
is facing challenges in the development of methods that are
simultaneously: well-defined, biologically meaningful, and
computationally feasible.

  The situation resembles in spirit the mathematical economics line of
research pioneered by the axiomatic approach of Kenneth J. Arrow, who
asked whether one c an design a system of voting that is at the same
time: rational, decisive and fair.

  Voting -- as studied by economists, political scientists, and
philosophers -- and multiple sequence alignment are methods of
aggregation of a set of individua l patterns into a collective pattern
that represents in a fair and meaningful way the set of individual
patterns.

  The 1972 Nobel Prize for Economics was given to Arrow for his
discovery of the
 
  Voting Impossibility Theorem: "No voting method could be
simultaneously rational , decisive and fair!"

  In this talk we will present preliminary results on a similar
axiomatic approa ch for understanding the intrinsic difficulties in
multiple sequence alignment, for the particular case when no
phylogenetic tree information is available.  We will review the
literature, focusing on the search for guiding principles tha t are
well-agreed upon for creating biologically significant alignments.  We
have identified three basic Axioms, proposed in the literature, which
in our view capture the essence of the competing difficulties:
well-definedness, biological meaningfulness, and computational
feasibility. No alignment method in the literature satisfies
simultaneously these three axioms.

  We will present (1) a new alignment method, the Perron-Frobenius
Alignment, th at appears to be the closest in the literature to
satisfying the three axioms, and (2) a new unifying framework for
combinatorial optimization alignment problems, based on "fair"
objective functions.

  We will conclude by discussing our attempts of proving a 
"Molecular Restoration Impossibility Theorem."