This is the Title of the Book, eMatter Edition
Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
Chapter 4: Sequence Similarity
Sequence Similarity
Sequence similarity is a simple extension of amino acid or nucleotide similarity. To
determine it, sum the individual pairwise scores in an alignment. For example,
the raw score of the following BLAST alignment under the BLOSUM62 matrix is 72.
Converting 72 to a normalized score is as simple as multiplying by lambda. (Note
that for BLAST statistical calculations, the normalized score is λS – ln K.)
Query: 885 QCPVCHKKYSNALVLQQHIRLHTGE 909
           +C VC K ++    L++H RLHTGE
Sbjct: 267 ECDVCSKSFTTKYFLKKHKRLHTGE 291
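To see where the 72 comes from, you can add up the pair scores yourself. The following is a minimal sketch, not part of BLAST: the hash holds only the BLOSUM62 entries needed for this one alignment, the subroutine names raw_score and normalized_score are made up for illustration, and the values lambda = 0.318 and K = 0.134 are assumed here as the commonly quoted parameters for ungapped BLOSUM62 (this section doesn't state them).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# BLOSUM62 scores for only the residue pairs in this alignment
# (a hand-copied subset, not the full matrix).
my %blosum62 = (
    QE => 2, CC => 9, PD => -1, VV => 4, HS => -1, KK => 5, KS => 0,
    YF => 3, ST => 1, NT => 0,  AK => -1, LY => -1, VF => -1, LL => 4,
    QK => 1, HH => 8, IK => -3, RR => 5,  TT => 5,  GG => 6,  EE => 5,
);

# Sum the independent pairwise scores of an ungapped alignment.
sub raw_score {
    my ($query, $sbjct) = @_;
    die "sequences differ in length\n"
        unless length($query) == length($sbjct);
    my $score = 0;
    for my $i (0 .. length($query) - 1) {
        my $pair = substr($query, $i, 1) . substr($sbjct, $i, 1);
        my $rev  = scalar reverse $pair;   # the matrix is symmetric
        my $s    = $blosum62{$pair} // $blosum62{$rev};
        die "no score for pair $pair\n" unless defined $s;
        $score += $s;
    }
    return $score;
}

# Normalized score in nats, as given in the text: lambda * S - ln K.
sub normalized_score {
    my ($raw, $lambda, $k) = @_;
    return $lambda * $raw - log($k);
}

my $raw = raw_score('QCPVCHKKYSNALVLQQHIRLHTGE',
                    'ECDVCSKSFTTKYFLKKHKRLHTGE');
print "raw score: $raw\n";    # prints "raw score: 72"

# Assumed parameters for ungapped BLOSUM62 (not taken from this section).
my ($lambda, $K) = (0.318, 0.134);
my $nats = normalized_score($raw, $lambda, $K);
printf "normalized score: %.2f nats (%.2f bits)\n", $nats, $nats / log(2);
```

Dividing by log(2) converts nats to bits, the same conversion used in the output of Example 4-1.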
Recall from Chapter 3 that the score of each pair of letters is considered independently
from the rest of the alignment. Sequence similarity rests on the same idea: the score of
an alignment is just the sum of its independent pair scores. There is a convenient
synergy between alignment algorithms and alignment scores. However, when you treat
the letters independently of one another, you lose contextual information. Can
you assume that the probability of A followed by G is the same as the probability of
G followed by A? In a natural language such as English, you know that this doesn’t
make sense. In English, Q is almost always followed by U. If you treat these letters
independently, you lose this restriction. The context rules for biological sequences aren’t as
strict as for English, but there are tendencies. For example, low-entropy sequences
appear much more frequently than expected by chance. To avoid becoming sidetracked
by the details, accept that you’re using an approximation, and note that in
practice, it works well.
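If you want to check how far a particular sequence strays from the independence assumption, compare how often a pair of adjacent letters occurs with how often it would occur if the two letters were independent. The sketch below does this for AG and GA. The subroutine names and the toy sequence are hypothetical, made up purely for illustration, and the ratio is a rough diagnostic rather than a formal statistical test.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count single letters and adjacent (overlapping) letter pairs.
sub context_counts {
    my ($seq) = @_;
    my (%mono, %di);
    $mono{$_}++ for split //, $seq;
    $di{ substr($seq, $_, 2) }++ for 0 .. length($seq) - 2;
    return (\%mono, \%di);
}

# Ratio of the observed pair frequency to the frequency expected if the
# two letters were independent. A ratio near 1 means little context effect.
sub obs_over_exp {
    my ($seq, $pair) = @_;
    my $n = length $seq;
    die "sequence too short\n" if $n < 2;
    my ($mono, $di) = context_counts($seq);
    my ($x, $y)     = split //, $pair;
    my $observed    = ($di->{$pair} // 0) / ($n - 1);
    my $expected    = (($mono->{$x} // 0) / $n) * (($mono->{$y} // 0) / $n);
    return $expected ? $observed / $expected : 0;
}

# Hypothetical toy sequence, made up for illustration only.
my $seq = 'AGAGAGAGCCTTAGAGGATCAGAGAG';
for my $pair (qw(AG GA)) {
    printf "%s observed/expected: %.2f\n", $pair, obs_over_exp($seq, $pair);
}
```

On real genomic sequence, ratios that differ between a pair and its reverse are a sign that context matters, which is exactly the information a pair-by-pair scoring scheme ignores.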
        $lambda = ($lambda + $low)/2;
    }
    else {
        $low = $lambda;
        $lambda = ($lambda + $high)/2;
    }
}

# compute target frequency and H
# (Pn, the probability of a single nucleotide, is defined earlier in this example)
my $targetID = Pn * Pn * exp($lambda * $match) * 4;
my $H = $lambda * $match    * $targetID
      + $lambda * $mismatch * (1 - $targetID);

# output
print "expscore: $expected_score\n";
print "lambda: $lambda nats (", $lambda/log(2), " bits)\n";
print "H: $H nats (", $H/log(2), " bits)\n";
print "%ID: ", $targetID * 100, "\n";
Example 4-1. A Perl script for estimating lambda (continued)
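If you'd like to experiment with different match and mismatch scores, here is a compact, self-contained sketch of the same bisection idea. It is not the listing from Example 4-1. The subroutine names pair_sum and estimate_lambda, the starting bracket of 0 to 2, the doubling of the upper bound, and the tolerance are all assumptions chosen for illustration, and every nucleotide is assumed to have probability 0.25.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use constant Pn => 0.25;   # assumed uniform probability of each nucleotide

# Sum of p_i * p_j * exp(lambda * s_ij) over all ordered nucleotide pairs:
# 4 matching pairs and 12 mismatching pairs.
sub pair_sum {
    my ($lambda, $match, $mismatch) = @_;
    return Pn * Pn * exp($lambda * $match)    * 4
         + Pn * Pn * exp($lambda * $mismatch) * 12;
}

# Find the positive lambda at which pair_sum equals 1, by bisection.
sub estimate_lambda {
    my ($match, $mismatch, $eps) = @_;
    $eps //= 1e-6;                                   # assumed tolerance
    my $expected = 0.25 * $match + 0.75 * $mismatch;
    die "illegal scores\n" if $match <= 0 or $expected >= 0;

    my ($low, $high) = (0, 2);                       # assumed starting bracket
    # Widen the bracket until the sum exceeds 1 at the upper bound.
    $high *= 2 while pair_sum($high, $match, $mismatch) < 1;

    while ($high - $low > $eps) {
        my $mid = ($low + $high) / 2;
        if (pair_sum($mid, $match, $mismatch) > 1) { $high = $mid }
        else                                       { $low  = $mid }
    }
    my $lambda = ($low + $high) / 2;

    # Target frequency of matches and relative entropy H, in nats.
    my $target_id = Pn * Pn * exp($lambda * $match) * 4;
    my $H = $lambda * $match    * $target_id
          + $lambda * $mismatch * (1 - $target_id);
    return ($lambda, $H, $target_id, $expected);
}

my ($lambda, $H, $id, $expected) = estimate_lambda(1, -3);
print  "expscore: $expected\n";
printf "lambda: %.3f nats (%.3f bits)\n", $lambda, $lambda / log(2);
printf "H: %.3f nats (%.3f bits)\n", $H, $H / log(2);
printf "%%ID: %.1f\n", $id * 100;
```

A handy sanity check is the +1/-1 scheme, where the equation can be solved by hand: with x = exp(lambda), 0.25x + 0.75/x = 1 gives x = 3, so lambda = ln 3 and the target identity is 75%.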