### Richard Desper

### Rutgers University

### Nearly Tight Bounds on the Learnability of Evolution

Evolution is often modeled as a stochastic process that modifies
DNA. One of the most popular and successful such models is the
Cavender-Farris (CF) model, in which evolution is represented by an
edge-weighted tree. The Phylogeny Construction Problem asks: given $k$
samples drawn from a CF tree, output a CF tree that is close to the
original.
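To make the sampling process concrete, here is a minimal sketch of drawing one sample from a CF tree. It assumes the standard two-state formulation: each node carries a $\pm 1$ state, the root's state is uniform, and each edge independently flips the state with its own probability; only the leaf states are observed. The tree representation and names (`sample_cf`, the example topology) are illustrative, not from the abstract.

```python
import random

def sample_cf(tree, root, flip_prob, rng):
    """Draw one sample (a +/-1 labeling of the leaves) from a CF tree.

    tree: dict mapping each internal node to a list of (child, edge) pairs
    flip_prob: dict mapping each edge name to its flip probability
    (hypothetical representation chosen for this sketch)
    """
    states = {root: rng.choice([+1, -1])}  # uniform root state
    stack = [root]
    while stack:
        node = stack.pop()
        for child, edge in tree.get(node, []):
            # Each edge independently flips the parent's state
            # with its own probability -- the CF process.
            flip = rng.random() < flip_prob[edge]
            states[child] = -states[node] if flip else states[node]
            stack.append(child)
    # Only leaf states are observed.
    return {n: s for n, s in states.items() if not tree.get(n)}

# Example: a four-leaf tree with uniform flip probability 0.1.
rng = random.Random(0)
tree = {"r": [("u", "e1"), ("v", "e2")],
        "u": [("a", "e3"), ("b", "e4")],
        "v": [("c", "e5"), ("d", "e6")]}
p = {e: 0.1 for e in ["e1", "e2", "e3", "e4", "e5", "e6"]}
sample = sample_cf(tree, "r", p, rng)
```

Repeating this draw $k$ times yields the input to the Phylogeny Construction Problem.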

Each CF tree naturally defines a random variable, and the gold
standard for reconstructing such trees is the Maximum Likelihood
Estimator for this variable. This approach is notoriously
computationally expensive.
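The likelihood being maximized can be sketched directly: for a fixed tree with fixed flip probabilities, the probability of an observed leaf labeling is a sum over all assignments of states to the internal nodes. The brute-force version below (exponential in the number of internal nodes; efficient per-tree algorithms exist, and the real expense is the search over tree topologies) is an illustrative sketch, not the abstract's method; the representation and names are assumptions.

```python
import itertools

def cf_likelihood(tree_edges, flip_prob, leaf_states):
    """Likelihood of observed +/-1 leaf states under a CF tree,
    by brute-force summation over internal-node states."""
    nodes = set()
    for u, v in tree_edges:
        nodes.update((u, v))
    internal = sorted(nodes - set(leaf_states))
    total = 0.0
    for assignment in itertools.product([+1, -1], repeat=len(internal)):
        states = dict(leaf_states)
        states.update(zip(internal, assignment))
        pr = 0.5  # uniform root state; the CF process is reversible
        for (u, v) in tree_edges:
            p = flip_prob[(u, v)]
            # An edge contributes p if its endpoints disagree, 1-p otherwise.
            pr *= p if states[u] != states[v] else 1.0 - p
        total += pr
    return total

# Two leaves joined through one internal node, flip probability 0.1 per edge.
edges = [("r", "a"), ("r", "b")]
probs = {("r", "a"): 0.1, ("r", "b"): 0.1}
lik = cf_likelihood(edges, probs, {"a": +1, "b": +1})
# lik == 0.5 * (0.9 * 0.9) + 0.5 * (0.1 * 0.1) == 0.41
```

The MLE then maximizes this quantity jointly over topologies and edge weights, which is where the computational cost arises.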

Our research has shown that a very simple algorithm, a variant of one
of the most popular algorithms used by practitioners, converges to the
true tree at a rate that differs from the optimum by only a constant.
We prove this by analyzing upper and lower bounds on the convergence
rate for learning very simple CF trees, and then showing that the
learnability of any CF tree is sandwiched between those of two such
simpler trees. Our results rely on the fact that, under the right
metric, the likelihood space of CF trees is smooth.
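The abstract does not name the practitioners' algorithm, but the most popular ones are distance-based: they first estimate, for every pair of leaves, the total weight of the path between them, and then build a tree from those estimates. For the two-state CF model the standard distance correction is $d(u,v) = -\tfrac{1}{2}\ln(1 - 2\hat{p})$, where $\hat{p}$ is the observed disagreement frequency between the two leaves. The sketch below shows only this estimation step, under that assumption; function and variable names are illustrative.

```python
import math

def cf_distances(samples, leaves):
    """Estimate pairwise CF path lengths from k samples.

    Uses the standard two-state distance correction
    d(u, v) = -(1/2) * ln(1 - 2 * p_hat), where p_hat is the
    fraction of samples in which leaves u and v disagree.
    """
    k = len(samples)
    dist = {}
    for i, u in enumerate(leaves):
        for v in leaves[i + 1:]:
            p_hat = sum(s[u] != s[v] for s in samples) / k
            if p_hat >= 0.5:
                # Correction undefined: the pair looks saturated.
                dist[(u, v)] = math.inf
            else:
                dist[(u, v)] = -0.5 * math.log(1.0 - 2.0 * p_hat)
    return dist

# Two leaves that disagree in 1 of 4 samples: p_hat = 0.25.
samples = [{"a": +1, "b": +1}] * 3 + [{"a": +1, "b": -1}]
d = cf_distances(samples, ["a", "b"])[("a", "b")]
```

A tree-building step (e.g. neighbor joining) would then run on these distances; the convergence-rate analysis concerns how many samples $k$ are needed before the estimates are accurate enough to recover the true topology.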