# DIMACS Seminar on Math and CS in Biology

## Title:

On the Complexity of String Folding

## Speaker:

- Teresa Przytycka
- University of Maryland

## Place:

- CoRE Building, Conference Room 433
- Busch Campus, Rutgers University

## Time:

- 11:00 AM
- Tuesday, April 2, 1996

## Abstract:

The motivation for the string-folding problems considered during this talk lies
in computational biology. Prediction of the three-dimensional structure of a
protein from its known linear sequence of amino acids is an important practical
open problem, which seems to be extremely challenging. A natural approach is
to look for a spatial configuration achieving a minimum free energy level.
While most people would expect that finding a minimum energy configuration
would be computationally intractable, previous results to this effect are very
limited.

We prove an NP-completeness result for a much-simplified model, in which we
attempt to capture rather more of the essential character of the
protein-folding problem. The protein molecule is represented by a string of
symbols, a bond can be made only between a pair of identical symbols, and we
seek an embedding of the given string in a grid so as to maximise the number of
pairs of matching symbols at adjacent grid points. This version of the folding
problem involves a mixture of combinatorial, geometric and topological
considerations.

Hart and Istrail have given an approximation algorithm for the
``hydrophobic-hydrophilic'' model over the same grids. This corresponds to our
model, but with a binary alphabet in which the only matches counted are between
adjacent 1's.

This is joint work with Mike Paterson.

Document last modified on March 28, 1996