DIMACS Theoretical Computer Science Seminar

Title: Sequence Assembly for High-Throughput Technologies

Speaker: Steve Skiena, State University of New York

Date: April 26, 2004, 3:30-4:30 pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ


Next-generation sequencing technologies based on pyrosequencing and single-molecule methods are extremely promising; however, the length and quality of the resulting reads are radically different from those produced by current sequencing machines. We study the space of read length, sequencing error rate, and coverage that lies well outside conventional assumptions to determine the technological and economic parameters under which de novo sequencing will be achievable with these new technologies. We demonstrate that genome assembly on bacterial and human sequences is possible (a) with astonishingly short reads, given sufficiently high coverage, and (b) under surprisingly high error rates, given long enough or plentiful enough reads.
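
As a rough illustration of how read length and coverage trade off, the sketch below uses the standard idealized model in which reads are placed uniformly at random, so the fraction of bases left uncovered is approximately exp(-c) at coverage c. This is not the speakers' method: the function names and example numbers are hypothetical, and the model ignores sequencing errors, repeats, and the minimum overlap needed to join reads.

```python
import math

def coverage(num_reads, read_length, genome_length):
    """Average number of reads covering each base: c = N * L / G."""
    return num_reads * read_length / genome_length

def expected_uncovered_fraction(cov):
    """Poisson approximation: fraction of bases hit by no read at coverage cov."""
    return math.exp(-cov)

def reads_needed(target_coverage, read_length, genome_length):
    """Number of reads required to reach a target coverage (rounded up)."""
    return math.ceil(target_coverage * genome_length / read_length)

# Hypothetical example: a 5 Mb bacterial genome sequenced to 30x coverage.
# Shorter reads need proportionally more reads for the same coverage.
genome = 5_000_000
for length in (25, 100, 800):
    print(length, reads_needed(30, length, genome))
```

Under this model, halving the read length doubles the number of reads needed for the same coverage, which is one reason the feasibility of very short reads depends on how cheaply coverage can be raised.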

(Joint work with J. Chen and A. Smirnov.)

See Webpage: http://www.cs.rutgers.edu/~muthu/theory.html