Since 1998, we have been using a somewhat unusual null model with our hidden Markov models: the reverse-sequence null model. This null model uses the same computation as the stochastic model itself, but applies it to the reversed sequence:
    P(x | NULL) = P(reversal(x) | M)
This reverse-sequence null model has been very effective at cancelling "noise" signals such as compositional bias and helicity in our fold-recognition tests.
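To make the definition concrete, here is a minimal sketch of a reverse-sequence log-odds score for a small discrete HMM. It is an illustration only, not the implementation used in our software: the function names (`forward_log_prob`, `reverse_null_log_odds`), the two-state toy model, and the two-letter alphabet are all hypothetical, and the sketch assumes non-empty sequences and strictly positive probabilities.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_prob(seq, start, trans, emit):
    """log P(seq | M) for a discrete HMM, computed with the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][c]  : probability that state s emits symbol c
    """
    n_states = len(start)
    # Initialise with the first symbol of the sequence.
    alpha = [math.log(start[s]) + math.log(emit[s][seq[0]])
             for s in range(n_states)]
    # Extend one symbol at a time, summing over predecessor states.
    for symbol in seq[1:]:
        alpha = [
            log_sum_exp([alpha[p] + math.log(trans[p][s])
                         for p in range(n_states)])
            + math.log(emit[s][symbol])
            for s in range(n_states)
        ]
    return log_sum_exp(alpha)

def reverse_null_log_odds(seq, start, trans, emit):
    """log P(x | M) - log P(reversal(x) | M): the same model scores both."""
    return (forward_log_prob(seq, start, trans, emit)
            - forward_log_prob(seq[::-1], start, trans, emit))

# Hypothetical toy model: two states over the alphabet {A, B}.
START = [0.9, 0.1]
TRANS = [[0.2, 0.8], [0.7, 0.3]]
EMIT = [{"A": 0.9, "B": 0.1}, {"A": 0.1, "B": 0.9}]

score = reverse_null_log_odds("AABAB", START, TRANS, EMIT)
```

Two properties follow directly from the definition: a palindromic sequence scores exactly zero, and a sequence and its reversal receive scores of equal magnitude and opposite sign. Because reversal preserves composition, any signal that depends only on composition contributes equally to both terms and cancels.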
To use a stochastic model effectively in database search, we need a way to compute the statistical significance of a score, usually expressed as the E-value (the expected number of hits scoring at least that well by chance). In this talk I'll derive a distribution for the E-values for reverse-sequence null models, and show how to fit the parameters of the distribution.
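For readers unfamiliar with E-values, the definition can be sketched as the database size times the chance probability of a score at least that good. The snippet below uses a purely empirical tail estimate from a set of chance scores; it is a stand-in for illustration, not the parametric distribution derived in the talk. The function name `empirical_e_value` and the convention that higher scores are better are assumptions.

```python
def empirical_e_value(score, null_scores, database_size):
    """Estimate the E-value of `score` from scores observed by chance.

    Assumes higher scores are better. The add-one correction keeps the
    estimated P-value above zero when no chance score reaches `score`.
    """
    exceed = sum(1 for s in null_scores if s >= score)
    p_value = (exceed + 1) / (len(null_scores) + 1)
    # Expected number of chance hits this good in a database of this size.
    return database_size * p_value

# Hypothetical chance scores and database size.
e = empirical_e_value(2.5, [0.0, 1.0, 2.0, 3.0], database_size=100)
```

An empirical estimate like this cannot resolve P-values smaller than about one over the number of chance scores, which is one reason a fitted parametric distribution is needed for large database searches.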
Results on recent fold-recognition tests will be shown.