## DIMACS TR: 97-71

## A few logs suffice to build (almost) all trees (I)

### Authors: P. L. Erdos, M. A. Steel, L. A. Szekely, and T. J. Warnow

**
ABSTRACT
**

A phylogenetic tree (also called an ``evolutionary tree'') is a leaf-labelled
tree which represents the evolutionary history for a set of species, and the
construction of such trees is a fundamental problem in biology. Here we
address the issue of how many sequence sites are required in order to recover
the tree with high probability when the sites evolve under standard Markov-
style (i.i.d.) mutation models. We provide analytic upper and lower bounds for
the required sequence length, by developing a new (and polynomial time)
algorithm. In particular we show that when the mutation probabilities are
bounded the required sequence length can grow surprisingly slowly (a power of
log n) in the number n of sequences, for almost all trees.

This researched started when the authors enjoyed the hospitality of DIMACS
in 1995.

Paper Available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1997/97-71.ps.gz

DIMACS Home Page