What It All Means


This is the Title of the Book, eMatter Edition


Chapter 7: A BLAST Statistics Tutorial

melanogaster genome. On the other hand, it appears that looking for short (less than 15 base pairs) cis-regulatory elements using either version of BLASTN with the default parameters is unlikely to be successful.

So what was the unreported WU-BLASTN Expect? Let’s calculate it. With the data in Table 7-3 and the previously calculated effective HSP length of 294, first calculate m´ and n´ using the Perl functions effectiveLengthSeq and effectiveLengthDB. Plugging m´ and n´, together with the WU-BLASTN λ and k and a raw score of 125, into the rawScoreToExpect function gives an Expect of 281. Recall that the NCBI-BLASTN Expect was 1e-6. That’s a 281-million-fold difference. BLAST is clearly parameter-sensitive! Using the default parameters, you instructed NCBI-BLASTN to search for short, highly conserved regions, and it found one. WU-BLASTN, on the other hand, is parameterized to look for large regions of relatively low percent identity. This would be fine for cross-species searches of poorly conserved exons but is inappropriate for finding oligos.
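The calculation described above can be sketched in Python. This is a minimal illustration, not the book's Perl code: the function names are hypothetical Python analogues of effectiveLengthSeq, effectiveLengthDB, and rawScoreToExpect, the floor at 1/K is an assumption of this sketch, and the numbers in the usage section are placeholders rather than the values from Table 7-3. It assumes the standard Karlin-Altschul relation E = K m´ n´ exp(-λS).

```python
import math


def effective_length_seq(m, hsp_len, k):
    """Effective query length m' = m - l (assumed floor of 1/K)."""
    return max(m - hsp_len, 1.0 / k)


def effective_length_db(n, hsp_len, num_seqs, k):
    """Effective database length n' = n - N * l (assumed floor of 1/K)."""
    return max(n - num_seqs * hsp_len, 1.0 / k)


def raw_score_to_expect(raw_score, k, lam, m_eff, n_eff):
    """Karlin-Altschul Expect: E = K * m' * n' * exp(-lambda * S)."""
    return k * m_eff * n_eff * math.exp(-lam * raw_score)


# Placeholder inputs, NOT the values from Table 7-3.
m_eff = effective_length_seq(m=1000, hsp_len=20, k=0.1)
n_eff = effective_length_db(n=1_000_000, hsp_len=20, num_seqs=10, k=0.1)
expect = raw_score_to_expect(raw_score=50, k=0.1, lam=0.2,
                             m_eff=m_eff, n_eff=n_eff)
print(f"m' = {m_eff}, n' = {n_eff}, Expect = {expect:.3g}")

# The fold difference quoted in the text: an Expect of 281 versus 1e-6.
fold = 281 / 1e-6  # about 2.81e8, i.e. 281 million
print(f"fold difference = {fold:.3g}")
```

Because the Expect depends exponentially on λS, even modest changes to λ or to the scoring scheme that produces S can shift the result by many orders of magnitude.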

Using BLAST intelligently requires using the correct parameters for the task at hand and not placing too much faith in the reported Expect. See the section on BLAST protocols in Chapter 9 for practical suggestions on BLAST parameter choice. Remember, you get what you look for.

What It All Means

You now know how bit scores, sum scores, Expects, and P-values are calculated. You’ve also seen first-hand that scoring matrices and target frequencies aren’t merely theoretical abstractions but realities that determine the outcome of a BLAST search. In some ways, choosing the right scoring scheme for a BLAST search is like choosing the right pair of eyeglasses. If your scoring scheme is too stringent, BLAST becomes nearsighted and will miss distant homologies. If your scheme is too lenient, BLAST becomes farsighted and fails to detect the obvious. Unfortunately, there’s no optimal scoring scheme. As in real life, sometimes the best you can do is put on bifocals.

You’ve also seen that searching the same sequence and database with varied parameters can result in different alignments having very different Expects. Scores and E-values aren’t implicit in a sequence or an alignment; they are solely contingent upon parameter values and the methods used to assess significance. There is nothing absolute about a BLAST significance value; it merely denotes the significance of an alignment in the context of a given search. Like everything else in bioinformatics, the biological implications of a (significant) alignment are inferred by the user and should be tested experimentally, if possible.
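To make the dependence on parameters concrete, the sketch below converts one raw score to a bit score, an Expect, and a P-value under two different (λ, K) pairs. The helper names and both parameter pairs are hypothetical, chosen only for illustration, and the search-space sizes are placeholders. It assumes the standard relations S´ = (λS - ln K) / ln 2, E = m´ n´ 2^(-S´), and P = 1 - exp(-E).

```python
import math


def bit_score(raw_score, lam, k):
    """Normalized (bit) score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_from_bits(bits, m_eff, n_eff):
    """Expect from a bit score: E = m' * n' * 2^(-S')."""
    return m_eff * n_eff * 2.0 ** (-bits)


def expect_to_pvalue(expect):
    """P = 1 - exp(-E), computed with expm1 for accuracy at small E."""
    return -math.expm1(-expect)


# Hypothetical (lambda, K) pairs for illustration only.
schemes = {
    "scheme A": (1.37, 0.71),
    "scheme B": (0.19, 0.18),
}
raw = 30              # the same raw score under both schemes
m_eff, n_eff = 1_000, 1_000_000  # placeholder effective lengths

for name, (lam, k) in schemes.items():
    bits = bit_score(raw, lam, k)
    e = expect_from_bits(bits, m_eff, n_eff)
    p = expect_to_pvalue(e)
    print(f"{name}: bits = {bits:.1f}, Expect = {e:.3g}, P = {p:.3g}")
```

The same raw score is highly significant under one parameter pair and unremarkable under the other, which is the sense in which a significance value only has meaning within a given search.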

Hopefully, you’ve also learned that there is more to Karlin-Altschul statistics than simply calculating an Expect for an alignment. Karlin-Altschul statistics provide a theoretical framework from which to interpret alignment scores in the context of parameter choice. They also give you the means to tune BLAST for specific purposes. Without them, you’d have no way of knowing what a given scoring scheme was looking for, and you’d cast around in the dark for the right set of parameters. Karlin-Altschul statistics remove the mystery from parameter choice. BLAST certainly has its limitations, but thanks to its statistical foundation, at least you know what you’re looking for.
