The C. elegans Genome Sequencing Project

Richard Wilson

Genome Sequencing Center
Washington University School of Medicine
Department of Genetics
St. Louis, MO 63108


The nematode worm, C. elegans, has proved particularly amenable to genetic analysis of development and neurobiology. Its haploid genome, distributed over six chromosomes, contains about 100 megabases (Mb), nearly all of which have been mapped to 15 large contigs. Following an initial pilot-phase sequencing project covering a 3 Mb region of chromosome III, our laboratories in Cambridge (UK) and St. Louis (USA) plan to complete the sequence of the entire genome within the next five years.

Our approach to genomic sequencing is first to derive random M13 (1-2 kb) or phagemid (6-9 kb) subclones from cosmids; where there is no cosmid coverage, libraries have been produced from whole or partial YACs. This "shotgun" phase is followed by a directed "walking" phase to completion. Subclones are sequenced and analyzed on ABI 373A fluorescent gel readers with an overall coverage redundancy of around six-fold. Recent improvements to the ABI 373A gel readers have substantially increased the amount of data that can be obtained from a single run.
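
As a rough illustration of what six-fold redundancy implies, the sketch below estimates how many shotgun reads a single clone would need and what fraction of it random reads alone would be expected to cover. The cosmid insert size (40 kb) and read length (400 bases) are assumptions chosen for illustration and are not figures from this project; the coverage estimate uses the standard idealized Poisson approximation (1 - e^-c), which ignores cloning bias and repeats.

```python
import math


def reads_for_coverage(target_bp, read_len_bp, coverage):
    """Number of reads needed so that total bases read = coverage * target size."""
    return math.ceil(coverage * target_bp / read_len_bp)


def expected_fraction_covered(coverage):
    """Idealized (Poisson) fraction of bases hit by at least one random read."""
    return 1.0 - math.exp(-coverage)


COSMID_BP = 40_000  # assumed insert size, for illustration only
READ_LEN_BP = 400   # assumed usable read length, for illustration only
COVERAGE = 6        # "around six-fold", as stated in the text

print(reads_for_coverage(COSMID_BP, READ_LEN_BP, COVERAGE))  # 600
print(round(expected_fraction_covered(COVERAGE), 4))         # 0.9975
```

Under these assumptions, random reads alone would still be expected to leave a small fraction of bases uncovered, which is consistent with following the shotgun phase with a directed walking phase.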

Software for assembly and finishing is continually improving: a recent version of the Staden assembly program (XGAP) incorporates several useful features within the contig editor to aid finishing. In addition, novel automation that speeds template preparation and DNA sequencing has been developed at both sites.

We have now sequenced about 6 Mb. Analysis of this region by GENEFINDER suggests an average gene density of 1 gene per 5 kb. The percentage of ESTs (derived from cDNAs) identified in the genome sequenced so far suggests that the gene density in this region is close to average for the whole genome. Approximately 45% of the predicted genes show homology to previously identified genes from other organisms.
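
The figures above can be turned into rough counts with simple arithmetic. The sketch below is a back-of-envelope calculation only: it uses the rounded values quoted in the text (6 Mb sequenced, about 100 Mb genome, 1 gene per 5 kb, 45% with homology), and the whole-genome number assumes the density seen so far holds everywhere. The function name is introduced here for illustration.

```python
def estimated_genes(region_bp, bp_per_gene=5_000):
    """Predicted gene count for a region at a uniform density of 1 gene per bp_per_gene."""
    return region_bp // bp_per_gene


SEQUENCED_BP = 6_000_000   # "about 6 Mb"
GENOME_BP = 100_000_000    # "about 100 megabases"
HOMOLOGY_FRACTION = 0.45   # "approximately 45%"

genes_sequenced = estimated_genes(SEQUENCED_BP)
genes_genome = estimated_genes(GENOME_BP)  # extrapolation, assumes uniform density
genes_with_homology = round(HOMOLOGY_FRACTION * genes_sequenced)

print(genes_sequenced)      # 1200
print(genes_with_homology)  # 540
print(genes_genome)         # 20000
```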

With the sequence of the central region of chromosome III essentially complete, we are now following the same strategy for sequencing chromosome II and the X chromosome.


Document last modified on March 28, 2000.