DIMACS Seminar on Math and CS in Biology

Title:

Gene recognition problems that can be solved (almost) exactly

Speaker:

Mikhail Gelfand

Institute of Protein Research, Russian Academy of Sciences

Place:

Seminar Room 431, CoRE Building

Rutgers University.

Time: - Note Special Day and Time

10:00 a.m.

Monday, November 18, 1996

Abstract:

Traditionally, gene recognition programs try to minimize some error measure that depends on the numbers of both false positive and false negative predictions. Their performance is thus characterized by the average correlation between predicted and actual genes, or by similar measures. However, a predicted complete gene that is approximately 80% correct (the current state of the art) has only limited value to an experimental biologist. A more useful result would be a less ambitious, but much more reliable, prediction. We are developing two algorithms aimed at such predictions. The first performs gene recognition when a related protein is known. It is based on the recently developed spliced alignment technique and currently provides at least 98% accuracy in recognition of human genes given mammalian relatives. If a related protein is unavailable, exact resolution of the exon-intron structure by purely statistical methods is unattainable. However, it is possible to make reliable partial predictions that can be used immediately for experimental verification of gene structure. In this vein, we propose a highly specific algorithm for construction of oligonucleotide probes and PCR primers. It uses only very simple statistical parameters and thus can be applied not only to mammalian genomes, but also to much less studied genomes.

This is joint work with A. Mironov, P. Pevzner and M. Roytberg.

Document last modified on October 30, 1996