CHAPTER 15
tBLASTn

CS Mukhopadhyay and RK Choudhary

School of Animal Biotechnology, GADVASU, Ludhiana

15.1 INTRODUCTION

tBLASTn is another type of translated BLAST algorithm, in which an amino acid sequence is used as the query against a nucleotide database. The query is compared at the protein level with each subject nucleotide sequence, which is translated in all six reading frames. tBLASTn is therefore very useful for finding protein homolog(s) in unannotated nucleotide data, such as expressed sequence tags (held in the BLAST database “est”) and draft genome records (held in the BLAST database “htgs”).
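
The six-frame translation applied to each subject sequence can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a toy DNA string; the function names are illustrative and are not part of BLAST, which additionally aligns and scores each translated frame against the query.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])    # forward strand
        frames[-(offset + 1)] = translate(rc[offset:])  # reverse strand
    return frames


toy_dna = "ATGGCCAAGTAA"  # toy subject sequence, not real data
for frame, protein in sorted(six_frame_translations(toy_dna).items()):
    print(f"{frame:+d}: {protein}")
```

Only one of the six translations of a coding region usually resembles the query protein, which is why the reading frame is reported alongside each hit.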

15.2 OBJECTIVE

To search nucleotide databases for sequences encoding homologs of a pair of given protein sequences (NP_001028007, NP_001028008).

15.3 PROCEDURE

The basic steps of tBLASTn are the same as for BLASTx:

15.3.1 Open the tBLASTn page

Open the NCBI home page by entering http://www.ncbi.nlm.nih.gov/ in the browser and click “tBLASTn”. Alternatively, open the page directly by entering http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=tblastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome in the address bar.

The main page of tBLASTn will be displayed (Figure 15.1).

FIGURE 15.1 Homepage for tBLASTn at NCBI. The query sequence(s) can be entered with either accession numbers or sequence(s) in FASTA format.
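
The direct address of the tBLASTn page is a base URL followed by three query parameters, and it must contain no spaces. As a small sketch, the address can be assembled in Python with the standard library; the parameter names and values are those given in the address above, and the variable names are illustrative.

```python
from urllib.parse import urlencode

# Base address and query parameters as given in Section 15.3.1.
BASE_URL = "http://blast.ncbi.nlm.nih.gov/Blast.cgi"
params = {
    "PROGRAM": "tblastn",
    "PAGE_TYPE": "BlastSearch",
    "LINK_LOC": "blasthome",
}

# urlencode joins the pairs with '&' and keeps them free of spaces.
tblastn_url = f"{BASE_URL}?{urlencode(params)}"
print(tblastn_url)
```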

15.3.2 Enter query sequences

  • Enter accession number(s) or FASTA sequence(s): Paste one or more protein query sequences in FASTA format, or the corresponding NCBI protein accession numbers (one per line), in the sequence box. Alternatively, upload a text file containing the amino acid query sequences (in FASTA format) by clicking the “Choose File” button.
  • Job Title: Give a title to identify the tBLASTn results among saved searches.
  • Align two or more sequences: Checking this box refreshes the page and provides a second sequence box, into which the subject nucleotide sequence(s) are pasted.
  • Query Sub‐range (optional): Specify the range of the input sequence that is to be searched against the database. This is especially useful when a GenBank accession number is used instead of the sequence itself.
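
The FASTA input and the query sub-range can be pictured with a short Python sketch. It assumes that sub-range coordinates are 1-based and inclusive; the record names and sequences are toy values, and the helper functions are illustrative rather than part of any NCBI tool.

```python
def parse_fasta(text):
    """Parse FASTA text into {identifier: sequence}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first word after '>'.
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def sub_range(sequence, start, end):
    """Return residues start..end (assumed 1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("sub-range must lie within the sequence")
    return sequence[start - 1:end]


# Toy multi-record FASTA input, not real database entries.
toy_fasta = """>toy_query1 hypothetical protein
MKTAYIAK
QRQISFVK
>toy_query2
DPLKLATEVG
"""

queries = parse_fasta(toy_fasta)
print(queries)
print(sub_range(queries["toy_query1"], 3, 8))
```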

15.3.3 Choose search set

  • Database: Choose any one of the nucleotide databases against which the search is to be made. The list of databases is almost the same as that for BLASTn, except for two options: “Human Genomics plus Transcript” and “Mouse Genomics plus Transcript” are absent.
  • Organism (optional): Specify the organism by common name, binomial name, or taxonomy ID, if required. To exclude an organism from the search results instead, check the small box next to the entry box; click the “+” sign to add further organisms.
  • Exclude Models (XM/XP) and/or Uncultured/environmental sample sequences (optional): Check either or both boxes to exclude the corresponding sequences. Models (XM/XP) are the “model reference sequences” produced and annotated by the Genome Annotation Project of NCBI and may therefore be incomplete.
  • Entrez Query (optional): As with BLASTn, this is used to restrict the search to the specified Entrez query. It allows the Boolean operators, AND, OR, NOT, to define the database to be searched.
  • BLAST: Click on the button to initiate the tBLASTn search. Click the check box to open the search result in a new window.
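
As an illustration of how Boolean operators combine in an Entrez query, the following Python sketch composes a query string. The helper names are hypothetical, and the field tags ([Organism], [Title]) and search terms are assumptions chosen for the example; the resulting string would be pasted into the Entrez Query box.

```python
def term(value, field):
    """Format one search term with a field tag, quoting phrases."""
    quoted = f'"{value}"' if " " in value else value
    return f"{quoted}[{field}]"


def combine(operator, *terms):
    """Join terms with a Boolean operator and parenthesise the result."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError("operator must be AND, OR, or NOT")
    return "(" + f" {operator} ".join(terms) + ")"


# Restrict to cattle or horse records whose titles do not contain "predicted".
organisms = combine(
    "OR",
    term("Bos taurus", "Organism"),
    term("Equus caballus", "Organism"),
)
entrez_query = combine("NOT", organisms, term("predicted", "Title"))
print(entrez_query)
```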

15.4 ALGORITHM PARAMETERS

These are the same as those for BLASTx:

  1. General parameters
  2. Scoring parameters
  3. Filters and masking

15.5 INTERPRETATION OF tBLASTn RESULTS

  1. The output of tBLASTn is similar to that of BLASTp or BLASTx.
  2. The color key‐based alignment depiction and the table listing the tBLASTn hits for the various homologous sequences are also the same as those for BLASTx (Figure 15.2).
  3. Each pairwise alignment is also laid out as in BLASTp. In addition, the “Frame” field indicates which of the six possible reading frames of the subject sequence was translated to produce the alignment.
  4. Variants of a protein can also be identified from the tBLASTn results.

FIGURE 15.2 The results page of tBLASTn contains the color key‐based alignment display, followed by a tabular description of sequence alignments and, finally, alignment of each of the sequence pairs (query versus database sequences).
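
To relate the “Frame” value of an alignment to positions on the subject nucleotide sequence, the following Python sketch converts a codon index within a frame into nucleotide coordinates. It assumes that frames +1 to +3 start at the first, second, and third base of the forward strand, and that frames -1 to -3 start at the last, second-to-last, and third-to-last base and read the reverse complement; the function name is illustrative.

```python
def codon_coordinates(frame, codon_index, subject_length):
    """Return (start, end) forward-strand positions of a codon.

    frame          : +1, +2, +3, -1, -2 or -3
    codon_index    : 1-based index of the codon within that frame
    subject_length : length of the subject nucleotide sequence

    Coordinates are 1-based. For negative frames start > end, because
    the codon is read on the reverse strand.
    """
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of +1..+3 or -1..-3")
    if codon_index < 1:
        raise ValueError("codon_index is 1-based")

    # Position of the codon along the strand being read.
    first = abs(frame) + 3 * (codon_index - 1)
    last = first + 2
    if last > subject_length:
        raise ValueError("codon extends past the end of the subject")

    if frame > 0:
        return first, last
    # Convert reverse-strand positions to forward-strand coordinates.
    return subject_length - first + 1, subject_length - last + 1


# Toy 12-base subject: the first codon of frame -1 covers bases 12 to 10.
print(codon_coordinates(+1, 1, 12))
print(codon_coordinates(-1, 1, 12))
```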

15.6 QUESTIONS

  1. The given amino acid sequence is to be checked for possible transcript variants (transcripts of the same gene with varying length and encoded protein sequences) in non‐humped cattle:

    DPLKLATEVGNTENQQGSASKSKVEMSCEGSAEPSDTTTTLCVQESIYGISEIPLVSSGDGAKDPNDECEVNSGNSMPDLEAEEELSEDHSQIHGNSVVLTNSTEPASEDPFVADENSTE

  2. Discover the protein homologs in the equine genome for the following genes, using taurine amino acid sequences as the query sequence: TSPY (Testis‐specific protein, Y‐encoded), Cathelicidin, TLR4.
  3. Discuss the applications of tBLASTn.
  4. Explain the result of tBLASTn given in Figure 15.2, systematically.
  5. Assume that the tBLASTn tool is not working for some days (or is not available). How will you proceed to analyze a given novel amino acid sequence to annotate its encoding gene‐specific features?