CHAPTER 28
Prediction of Secondary Structure of Protein

CS Mukhopadhyay and RK Choudhary

School of Animal Biotechnology, GADVASU, Ludhiana

28.1 INTRODUCTION

Secondary structures of a protein (e.g., α‐helices, β‐sheets, loops, and coils) arise from patterns of hydrogen bonds between the amide (N-H) and carbonyl (C=O) groups of nearby residues along the peptide backbone. Secondary structure prediction comprises a set of computational approaches that assign the regions of local folding of a protein (i.e., the secondary structure) from its amino acid sequence (i.e., the primary structure).

28.2 OBJECTIVE

To predict the secondary structure of a given peptide molecule, using an online secondary structure prediction tool.

28.3 SECONDARY STRUCTURE PREDICTION USING ONLINE TOOL PSIPRED

28.3.1 Procedure

28.3.1.1 Download the amino acid sequence

Download a peptide sequence (e.g., bovine keratin; NCBI Protein Acc. No. Q148H4.1) in FASTA format from the NCBI‐Protein (http://www.ncbi.nlm.nih.gov/protein/) or UniProt (http://www.uniprot.org/uniprot/) database.

> Q148H4.1|BovineKeratin

M T C G S G F R G R A F S C V S A C G P R P G R C C I T A A P Y R G I S C Y R G L T G G F G S R S I C G G F R A G S F G R S F G Y R S G G V G G L N P P C I T T V S V N E S L L T P L N L E I D P N A Q C V K Q E E K E Q I K C L N N R F A A F I D K V R F L E Q Q N K L L E T K L Q F Y Q N R Q C C E S N L E P L F N G Y I E T L R R E A E C V E A D S G R L S S E L N S L Q E V L E G Y K K K Y E E E V A L R A T A E N E F V A L K K D V D C A Y L R K S D L E A N V E A L I Q E I D F L R R L Y E E E I R V L Q A H I S D T S V I V K M D N S R D L N M D N I V A E I K A Q Y D D I A S R S R A E A E S W Y R S K C E E I K A T V I R H G E T L R R T K E E I N E L N R V I Q R L T A E V E N A K C Q N S K L E A A V T Q A E Q Q G E A A L N D A K C K L A G L E E A L Q K A K Q D M A C L L K E Y Q E V M N S K L G L D I E I A T Y R R L L E G E E Q R L C E G V G S V N V C V S S S R G G V V C G D L C V S G S R P V T G S V C S A P C S G N L A V S T G L C A P C G P C N S V T S C G L G G I S S C G V G S C A S V C R K C
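
A sequence copied from a page such as this one often carries spaces between residues and blank lines after the header. The following minimal Python sketch (the function names parse_fasta and to_fasta are illustrative, not part of any tool mentioned here) shows one way to clean such text before pasting it into a prediction server. It uses only the first 28 residues of the sequence above to keep the example short.

```python
def parse_fasta(text):
    """Return (identifier, sequence) for the first record in FASTA text."""
    header = None
    residues = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                break  # stop at the start of a second record
            header = line[1:].strip()
        elif header is not None:
            # Drop internal spaces such as "M T C G ..."
            residues.append("".join(line.split()).upper())
    return header, "".join(residues)


def to_fasta(header, sequence, width=60):
    """Format a record as FASTA text with fixed-width sequence lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines)


record = """> Q148H4.1|BovineKeratin

M T C G S G F R G R A F S C V S A C G P R P G R C C I T
"""

header, seq = parse_fasta(record)
print(header)  # Q148H4.1|BovineKeratin
print(seq)     # MTCGSGFRGRAFSCVSACGPRPGRCCIT
print(to_fasta(header, seq))
```
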

28.3.1.2 Open PSIPRED

Open PSIPRED using the URL http://bioinf.cs.ucl.ac.uk/psipred/:

  1. Paste the amino acid sequence in the sequence box. A short sequence identifier may be entered.
  2. Checkboxes offer options for running multiple analyses in one go, such as Profile Based Fold Recognition (pGenTHREADER) and Rapid Fold Recognition (GenTHREADER).
  3. Provide your email ID in the specified box to get the results mailed to your inbox.
  4. Click on the “Predict” button. In general, analysis takes several hours to complete.

FIGURE 28.1 Graphical user interface (GUI) of PSIPRED with the Input tab selected and the box for PSIPRED v3.3 (Predict Secondary Structure) checked under Choose Prediction Method.

28.3.2 Output

The output is available in three tabs on the same screen:

  • Summary: provides an easy‐to‐understand presentation of the secondary structure throughout the peptide sequence.
  • PSIPRED: shows thumbnails hyperlinked to the detailed result. Clicking a thumbnail opens the output in a new window, from which it can be downloaded as a figure in .png format.
  • Downloads: The whole output or a part of it can be downloaded in different formats, viz.
    1. the whole output in a zipped file,
    2. only the results as plain text,
    3. the raw scores in plain text format,
    4. the results in Postscript or PDF format.
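
The plain-text raw scores can be read into a script for further analysis. The sketch below assumes each data line holds six whitespace-separated fields (residue position, amino acid, predicted state, then coil, helix, and strand scores), which is the layout commonly seen in PSIPRED .ss2 files. That layout is an assumption to check against the downloaded file, and the sample values are invented for illustration.

```python
def parse_ss2(text):
    """Parse raw-score text into a list of per-residue dictionaries.

    Assumed columns: position, residue, state, coil, helix, strand.
    Lines that are blank, start with '#', or lack six fields are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#") or len(fields) != 6:
            continue
        coil, helix, strand = (float(x) for x in fields[3:6])
        rows.append({
            "pos": int(fields[0]),
            "aa": fields[1],
            "ss": fields[2],
            "coil": coil,
            "helix": helix,
            "strand": strand,
        })
    return rows


# Invented example in the assumed layout (not real PSIPRED output)
sample = """# PSIPRED VFORMAT (PSIPRED V3.3)

   1 M C   0.998  0.001  0.002
   2 T C   0.871  0.020  0.110
   3 C E   0.300  0.050  0.650
"""

rows = parse_ss2(sample)
prediction = "".join(row["ss"] for row in rows)
print(prediction)  # CCE
```
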

28.3.3 Interpretation of results

The output shows the predicted type of secondary structure for each amino acid residue, using the notations described below. Note the proportions of the different types of secondary structure in the result.

The detailed result diagrammatically represents the different types of secondary structure predicted. Each of the blocks contains the following rows:

  • Confidence (Conf): The blue bars indicate the confidence (or probability of accuracy of prediction) for each of the amino acids. The taller the bar, the more confident we can be about the predicted secondary structure for that residue.
  • Prediction represented by the cartoon (Pred): The black, thin straight line represents coils (C), the yellow‐colored arrow represents beta sheet (E), and the pink‐colored barrel or pipe indicates helix (H).
  • Prediction coded by alphabetic notations (Pred): “H” for helix, “C” for coil, and “E” for beta sheet.
  • Amino acid sequence (AA): Each of the amino acid residues is placed in sequence.

Finally, the residue positions are numbered in multiples of 10 (viz. 10, 20, 30, …).
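
The proportions of helix, strand, and coil can be worked out from the string of H, E, and C letters in the Pred row. The following is a minimal sketch under that assumption. The prediction string is a made-up 20-residue example, not PSIPRED output for the keratin sequence, and the function names are illustrative.

```python
from collections import Counter
from itertools import groupby


def composition(pred):
    """Return the percentage of H, E, and C states in a prediction string."""
    if not pred:
        raise ValueError("prediction string is empty")
    counts = Counter(pred)
    return {state: 100.0 * counts.get(state, 0) / len(pred) for state in "HEC"}


def segments(pred):
    """Return (state, start, end) for each run of identical states (1-based)."""
    out = []
    start = 1
    for state, run in groupby(pred):
        length = len(list(run))
        out.append((state, start, start + length - 1))
        start += length
    return out


pred = "CCCHHHHHHHHCCEEEECCC"  # hypothetical 20-residue prediction

print(composition(pred))  # {'H': 40.0, 'E': 20.0, 'C': 40.0}
for state, start, end in segments(pred):
    print(state, start, end)
```

Listing the segments alongside the percentages makes it easier to compare two predictions region by region instead of only by overall composition.
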

FIGURE 28.2 The output tabs of PSIPRED shown in three sections: Summary output, detailed prediction in *.png format, and downloads.

28.4 SECONDARY STRUCTURE PREDICTION USING THE ONLINE CDM TOOL

The CDM (Consensus Data Mining) tool has evolved from the GOR (Garnier, Osguthorpe, and Robson) methods of protein secondary structure prediction.

28.4.1 Procedure

  1. Download an amino acid sequence: Download a peptide sequence in FASTA format.
  2. Open URL: http://gor.bb.iastate.edu/
  3. Open your workspace and enter a working email ID.
  4. Input sequence: One can either enter the PDB ID or paste the amino acid sequence (maximum 1000 amino acids) into the specified sequence box. Please ensure that the sequence is in raw sequence format (i.e., not in FASTA format, and containing no letters or symbols other than the single‐letter codes of the 20 amino acids).
  5. Click on “Submit”.
  6. The result will be sent to the given email ID.

FIGURE 28.3 GUI of the online CDM tool for prediction of protein secondary structures, displaying the protein sequence, with Sequence name and Your e-mail address set to BovKeratin and csmbioinfo@gmail.com, respectively.
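
The input constraints in step 4 (raw format, only the 20 single‐letter amino acid codes, at most 1000 residues) can be checked before submission. The sketch below is one possible check; the function names and messages are illustrative and are not part of the CDM tool.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LENGTH = 1000  # limit stated in step 4 of the procedure


def to_raw_sequence(text):
    """Drop FASTA header lines and all whitespace, returning raw residues."""
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith(">")]
    return "".join("".join(lines).split()).upper()


def check_raw_sequence(seq, max_length=MAX_LENGTH):
    """Return a list of problems; an empty list means the sequence passes."""
    problems = []
    if not seq:
        problems.append("sequence is empty")
    bad = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if bad:
        problems.append("non-standard symbols: " + ", ".join(bad))
    if len(seq) > max_length:
        problems.append("length %d exceeds %d residues" % (len(seq), max_length))
    return problems


raw = to_raw_sequence("> Q148H4.1|BovineKeratin\n\nM T C G S G F R G R\n")
print(raw)                          # MTCGSGFRGR
print(check_raw_sequence(raw))      # []
print(check_raw_sequence("MTXB*"))  # ['non-standard symbols: *, B, X']
```
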

28.5 QUESTIONS

  1. Compare the secondary structures of caprine beta‐defensin (GenBank: ABF71365.1) versus bubaline lingual antimicrobial peptide (ABE66309.1) using the PSIPRED tool.
  2. The NCBI Protein Accession number for bubaline Dicer peptide is given as BAP00765.1. Predict the secondary structure of the following domains: PAZ, RIBOc, DSRM and RNaseIII.
  3. Compare the secondary structures of the antimicrobial domains of the following cathelicidin variants: NCBI Protein Id: AGA63736.2, XP_006065246.1, NP_001277882.1.
  4. What do you mean by “secondary structure of protein”? What are the applications of secondary structure prediction?
  5. What are the constraints in predicting beta‐pleated sheets?