DIMACS-CTS (National Chiao Tung University) Conference on the Interconnections among Codes, Designs, Graphs, and Molecular Biology

Dates: May 24 - 26, 2002
National Center of Theoretical Science, National Chiao Tung University, Hsinchu, Taiwan

Frank Hwang, National Chiao Tung University, fhwang@math.nctu.edu.tw
Fred Roberts, DIMACS, Rutgers University, froberts@dimacs.rutgers.edu
David Torney, Los Alamos National Labs, dct@lanl.gov
Presented under the auspices of the DIMACS Special Focus on Computational Molecular Biology.
Discrete structures have been used to formulate and solve some of the most fundamental concepts of molecular biology. In this workshop, we will explore:

We will explore combinatorial design issues arising in the identification of clones containing a specific DNA sequence, coding theory problems with codewords implemented as DNA sequences and graph theoretical concepts that arose originally from DNA reconstruction when the ends of some of the clones are radioactively tagged.

The following are some of the interconnected topics to be explored in this workshop:

We will explore the use of group testing methods in identifying positive clones, which is a crucial step in physical-map-based sequencing. The method of group testing, using combinatorial designs, tests a group of clones at a time, hence reducing the number of samples to be collected and assayed. The focus will be on nonadaptive algorithms for fast implementation. More generally, we will explore new types of design desiderata motivated by screening assays undertaken in various applications such as pooling to find "positive" subsets of molecules, in the study of inhibitory molecules, and in data mining problems in functional genomics.

There is a long history of interrelations among combinatorial design and coding theory and we will explore the use of DNA sequences as codewords in the design of new types of codes. Similar considerations of finite sequences from a finite alphabet (sometimes called categorical sequences) arise in the analysis of genome-scan genotype data (such as marker data from the GAW 12 dataset) and constitute a generalized type of group testing. Various metrics underly the development of codes, and we will explore related work on categorical sequences involving the development of new metrics for sequence similarity.

A very useful tool in the physical mapping of DNA is the optical mapping technique that starts with many single, partially digested copies of a DNA molecule. Realistic algorithms, dealing with false positive and false negative and otherwise inaccurate data, use algorithms that combine continuous optimization with discrete methods and have recently found connections to the types of considerations arising in DNA coding.

In early approaches to realistic physical mapping of DNA, when typically only partial overlap data among clones exists, tagging the ends of some clones radioactively formed the basis for useful mapping procedures. The task of DNA reconstruction could then be looked at as the problem of characterizing the graphs called tagged probe interval graphs and constructing tagged representations for them when they existed. Since this earlier work, special classes of tagged probe interval graphs and other types of intersection graphs have arisen in fascinating ways from overlap of categorical sequences of letters from a finite alphabet and other bioinformatics applications. We will explore a variety of graph-theoretical and other algorithms for such problems as sequence-fragment assembly.

Next: Call for Participation
Workshop Index
DIMACS Homepage
Contacting the Center
Document last modified on November 26, 2001