Workshop on Sequence Alignment

November 10 - 12, 1994
Princeton University, Princeton, NJ

Alberto Apostolico, Purdue,
Presented under the auspices of the Special Year in Mathematical Support for Molecular Biology.

Call for Participation





{\bf First Call for Participants}

Topics of this Workshop include all theoretical and practical issues 
arising in biological sequence alignment, with some emphasis on multiple 
sequence alignment and some consideration of general sequence comparison. 
In keeping with the spirit and goals of this DIMACS Special Year, this Workshop
aims to serve as an arena for a structured, interdisciplinary discussion
of the motivations, past accomplishments and future directions of the area
rather than as a standard display of recent results.  Along these lines, 
critical re-examinations of known results are welcome. Submissions of recent, 
specialized results should clearly demonstrate how those results 
further the stated objectives. Biologists are especially encouraged to report 
on the successes and failures of existing methods, and to contribute to
a deeper understanding of the related biological issues. 
Contributions will be distributed among three main {\bf categories}, as follows: 


\noindent{\bf 1 - Two-sequence Alignment:} History (1.1), Models (1.2), 
Statistical Significance (1.3), Algorithms and Programs (1.4). 

\noindent{\bf 2 - Multiple Sequence Alignment:} Motivation (2.1), 
Models (2.2), Algorithms and Programs (2.3). 

\noindent{\bf 3 - Relationships between Alignment and Comparison Methods:} 
Dot-plots and Related Filtering Techniques (3.1), Path Extension Heuristics 
(3.2), Data Base Searches (3.3).


The Workshop will feature about 10 Main Lectures, roughly as many 
Contributed Papers, a Poster Session (including, in particular, Open 
Problems), and a Software Demonstration Lab (open, in particular, to 
participants in this Special Year's Algorithmic Implementation Challenge). 
Industrial involvement, strongly applied contributions and junior 
participants are particularly welcome. 
All activities are coordinated by a Workshop Committee, some members of which 
will also deliver the Main Lectures. Preliminary Workshop Committee:   
S. Altschul (NIH), A. Apostolico (Padova and Purdue, Chair), 
D. Brutlag (Stanford), R. Doolittle (UCSD), M. Farach (Rutgers and DIMACS), 
R. Giancarlo (Bell and DIMACS), P. Green (Washington U.), 
D. Gusfield (UCD), J. Kececioglu (UCD),
D. Lipman (NIH), W. Miller (PennState), P. Pevzner (PennState), 
D. Sankoff (Montreal), M. Vingron (GMD), M. Waterman (USC). 

The deadline for all submissions is June 30, 1994. To propose a
Contributed Paper, send a title, a (sub)category classification and 
a three-page abstract. To sign up for a Poster Presentation, it is sufficient 
to send a title. To sign up for a Software Demo, send its title and a 
short abstract, specifying its duration and any special equipment needed. 

All technical correspondence should be directed, preferably by e-mail
or (in extreme cases) by fax, to the Scientific Secretariat 
(Dr. Raffaele Giancarlo, AT\&T Bell Laboratories, Room 2C 454, 600 Mountain 
Avenue, Murray Hill, NJ 07974-0636; fax: [908] 582 5857).

The Final Program will be distributed on September 15, 1994. 
Inquiries concerning registration, logistics, etc. 
should be directed to 
Ms. Sandra Barbu, Computer Science, Princeton University,
Princeton, NJ 08540;
voice: [609] 258-4562 or 5030; 
fax: [609] 258-1771. 

An extended version of this Call, with more information on the nature of 
contributions, is either appended or may be accessed by anonymous 
ftp to: 


Sequence Alignment is one of the basic tools used in
Molecular Biology to assess the likelihood of relationships
among biological sequences. Over the years, a
theory of sequence alignment has been developed and 
translated into computational tools useful to the biologist. Traditionally, 
the evolution of this theory and its fallout have
mostly taken place within the Biomathematics and Molecular Biology
communities. In recent years, there has been increasing 
participation of Computer Scientists in the process of developing
algorithmic tools for sequence alignment. As the area draws more and 
more effort from a heterogeneous community of researchers, the chances 
increase that its original motivations and objectives will be lost.
The goal of this Workshop is a critical re-examination of the biological 
and mathematical foundations, historical development, and achievements
of sequence alignment methods. We have identified three main
categories of focus and a few (not always disjoint) 
sub-categories within them, as follows. 


\centerline{\bf 1 - Alignment Between Two Sequences} 
\noindent {\bf 1.1 History}: Contributions to this part should clarify the
original needs that motivated the study of sequence alignment, as well
as the reasons why it evolved in one direction rather than
another.  In addition to the technical content, the presentation(s) should
emphasize what biologists expected from sequence alignment and how
far those expectations have been fulfilled. It would be
particularly interesting to understand which major
results and biological findings led designers of sequence
alignment methods to pursue some avenues of research rather than others. 


\noindent {\bf 1.2 Models}: This part should address the bio-mathematical 
theory of sequence alignment, possibly by means of a comprehensive
review of the various models and of the notions of similarity between
sequences that they are trying to capture. For each model, it would be
interesting to understand how the choice of parameters in the model
(e.g., weighting functions) is carried out. 


\noindent {\bf 1.3 Statistical Significance}:
It would be interesting to understand
which statistical tools are used in this area to establish the
significance of an alignment and how well they model the
underlying biological knowledge. 
In addition, it is highly 
desirable to present the statistical theory on which
those tools rest.  


\noindent {\bf 1.4 Algorithms and Programs}: This part should cover 
how the models are translated into computational tools. The emphasis should 
be on the trade-offs between the sophistication of the theoretical 
models and the difficulty of implementing them. 
It would be interesting to point out which subtleties of a model 
are neglected in the computation and why. Finally, there should also be
a presentation of the computational problems associated with the
statistical significance of the model (an issue hardly discussed
in the literature).  For instance, the determination of which weights
to pick may represent a computationally intensive ``learning problem''. 


\centerline{\bf 2 - Multiple Sequence Alignment}


\noindent {\bf 2.1 Motivation}: The presentations in this area should try to 
outline which biological problems motivate the study of multiple
sequence alignment and what is expected  from it.


\noindent {\bf 2.2 Models}: Analogous to 1.2. 


\noindent {\bf 2.3 Algorithms and Programs}: The current formulations of 
multiple sequence alignment turn out to be computationally intensive.
Usually, one can ``get around'' this problem by computing an
``approximation'' of what was originally intended. Again, it would be
interesting to know how the designers and implementors of multiple
sequence alignment algorithms and programs deal with this difficulty.
Moreover, it would also be very interesting to have an account of
the computational needs that arise in establishing the
significance of an alignment.


\centerline{\bf 3 - Sequence Alignment and Sequence Comparison}

Sequence alignment refers to the collection of 
dynamic programming (DP) techniques used in establishing the similarity 
or homology of two or more sequences. Sequence comparison is used to 
refer to: (i) the broader repertoire of techniques used for those 
purposes; (ii) applications, most notably, data base searches, 
where adaptations of DP-driven sequence alignment 
serve ancillary tasks in the pursuit of homologies. Contributions 
to sub-categories 3.1 and 3.2 should primarily aim at a comparative analysis 
of the relative merits of the methods. Contributions to 3.3
may concentrate instead on the technical issues related to the efficient 
organization and search of bio-sequences.

Document last modified on November 1, 1994