In this paper, we address several notions of similarity between two
RNA strings that take into account both the primary sequence and
secondary base-pairing structure of the strings. We present efficient
algorithms for exact matching and approximate matching between two RNA
strings. We define a notion of alignment between two RNA strings and
devise algorithms based on dynamic programming. We then present a
method for optimally aligning a given RNA sequence with unknown
secondary structure to one with known sequence and structure, thus
attacking the structure prediction problem in the case when the
structure of a closely related sequence is known. The techniques
employed to prove our results include reductions to well-known string
matching problems, allowing wild cards and ranges, and speeding up
dynamic programming by using the tree structures implicit in the
secondary structure of RNA strings.
Paper Available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/1996/96-30.ps.gz