DIMACS Seminar on Math and CS in Biology


An Exploration in Phylogenetic Space or How I Learned to Love the Tree


Dr. Peter Smouse
Center for Theoretical and Applied Genetics
Cook College, Rutgers University


DIMACS Seminar Room, CoRE Building, Room 431
Busch Campus, Rutgers University


3:00 PM
Monday, April 24, 1995


We have been exploring minimum spanning trees (MSTs) in the context of human evolution, as determined for the mtDNA genome. There is enough ambiguity in the data set, due to homoplasic mutation, that the number of MSTs (each a maximally parsimonious solution) is truly vast, far too many to evaluate exhaustively. In the process of exploring the "tree space", we discover that "good trees" share several features: (1) they are all quite similar (and I will present a method of quantifying what I mean by the term "similar"); (2) they all resemble the data closely (and I will present a way of gauging that); (3) they minimize the sum of squared distances among all possible pairs of entries, measured along the tree. While the enumeration problem is NP-hard, defining the class of "good trees" is not difficult, and the class occupies a portion of tree space that we might be able to delineate effectively. The geometry of the problem, and how it intersects with the graph theory, lends itself to such standard techniques as principal coordinates analysis, quadratic programming, and so on. Quite apart from the mathematical points, we learn interesting things about translating the data into tree form, what that means, and what it doesn't mean.
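
As a rough illustration of criterion (3), the following Python sketch builds one minimum spanning tree over a handful of sequences and then sums the squared distances between all pairs of entries, measured along the tree. It is not the speaker's method: the toy sequences, the use of Hamming distance as the edge weight, and the function names (hamming, prim_mst, tree_distances, sum_squared_tree_distance) are all assumptions made for the example.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def prim_mst(seqs):
    """Return one MST as a list of (u, v, weight) edges (Prim's algorithm).

    Ties are broken by index order, so only one of possibly many
    equally parsimonious trees is returned.
    """
    n = len(seqs)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge joining the current tree to a vertex outside it.
        w, u, v = min(
            (hamming(seqs[u], seqs[v]), u, v)
            for u in in_tree
            for v in range(n)
            if v not in in_tree
        )
        edges.append((u, v, w))
        in_tree.add(v)
    return edges


def tree_distances(n, edges):
    """Path length between every pair of vertices, measured along the tree."""
    adj = {i: [] for i in range(n)}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [[0] * n for _ in range(n)]
    for start in range(n):
        stack = [(start, -1, 0)]
        while stack:
            node, parent, d = stack.pop()
            dist[start][node] = d
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, d + w))
    return dist


def sum_squared_tree_distance(n, edges):
    """Sum of squared along-tree distances over all unordered pairs."""
    dist = tree_distances(n, edges)
    return sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2))


# Hypothetical toy data: four binary "haplotypes" (not real mtDNA).
seqs = ["0000", "0001", "0011", "0111"]
mst = prim_mst(seqs)
score = sum_squared_tree_distance(len(seqs), mst)
print(mst, score)
```

Under these assumptions, the same score could be computed for each of several equally parsimonious trees and used to rank them, with lower values indicating trees whose path lengths keep the entries closer together.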
Document last modified on April 24, 1995