Sandia National Laboratories, Algorithms and Discrete Mathematics, Department, MS 1110 Albuquerque, NM 87185
This talk will present algorithms for analysis, error- correction, and mapping of genomic data. We faced the challenging questions of "What is a genomic map?" and "What is a proof of good performance for a mapping algorithm?" and evaluated the validity of our proposed solutions through software simulations on synthetic and real genomic data. Our primary focus was on hybridization data obtained from clone libraries having chimerism and deletions. We have developed mathematical models of the space of potential hybridization data which yield tractable approximation algorithms while maintaining realistic properties of the data.
Our approach was to identify natural map measures whose optimal or close to optimal values correlate with map quality in the presence of various types of errors. These measures give rise to single- or multiple- objective optimization problems whose optimal solution is invariably computationally intractable. Several methods were then employed in the design of performance-guaranteed approximation algorithms and of heuristics that exhibit a good trade-off between comput ational- performance and biological- accuracy. The techniques range from customary tools that provide approximations for the Traveling Salesperson problems to greedy techniques, and from local-improvement algorithms with provable performance to spectral method heuristics employed in numerical analysis.
(Joint work with David Greenberg (Sandia National Labs))