CHAPTER 28 Prediction of Secondary Structure of Protein
CS Mukhopadhyay and RK Choudhary
School of Animal Biotechnology, GADVASU, Ludhiana
28.1 INTRODUCTION
Secondary structures of a protein (e.g., α‐helices, β‐sheets, loops, and coils) arise from patterns of hydrogen bonds between the amide (N–H) and carbonyl (C=O) groups of the peptide backbone. Secondary structure prediction comprises a set of biocomputational approaches that assign the regions of local folding of a protein (i.e., the secondary structure) from its amino acid sequence (i.e., the primary structure).
28.2 OBJECTIVE
To predict the secondary structure of a given peptide molecule, using an online secondary structure prediction tool.
28.3 SECONDARY STRUCTURE PREDICTION USING ONLINE TOOL PSIPRED
28.3.1 Procedure
M T C G S G F R G R A F S C V S A C G P R P G R C C I T A A P Y R G I S C Y R G L T G G F G S R S I C G G F R A G S F G R S F G Y R S G G V G G L N P P C I T T V S V N E S L L T P L N L E I D P N A Q C V K Q E E K E Q I K C L N N R F A A F I D K V R F L E Q Q N K L L E T K L Q F Y Q N R Q C C E S N L E P L F N G Y I E T L R R E A E C V E A D S G R L S S E L N S L Q E V L E G Y K K K Y E E E V A L R A T A E N E F V A L K K D V D C A Y L R K S D L E A N V E A L I Q E I D F L R R L Y E E E I R V L Q A H I S D T S V I V K M D N S R D L N M D N I V A E I K A Q Y D D I A S R S R A E A E S W Y R S K C E E I K A T V I R H G E T L R R T K E E I N E L N R V I Q R L T A E V E N A K C Q N S K L E A A V T Q A E Q Q G E A A L N D A K C K L A G L E E A L Q K A K Q D M A C L L K E Y Q E V M N S K L G L D I E I A T Y R R L L E G E E Q R L C E G V G S V N V C V S S S R G G V V C G D L C V S G S R P V T G S V C S A P C S G N L A V S T G L C A P C G P C N S V T S C G L G G I S S C G V G S C A S V C R K C
Paste the amino acid sequence in the sequence box. A short sequence identifier may be entered.
Checkboxes allow additional analyses, such as Profile Based Fold Recognition (pGenTHREADER) and Rapid Fold Recognition (GenTHREADER), to be run in the same submission.
Click on the “Predict” button.
Provide your email ID in the specified box to get the results mailed to your inbox. In general, analysis takes several hours to complete.
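The example sequence above is printed with a space between residues. Before pasting it into the sequence box, it may help to remove the spaces and confirm that only the 20 standard single‐letter codes are present. The following is a minimal sketch in Python. The function name `clean_sequence` is hypothetical and is not part of PSIPRED, and whether the server tolerates spaces or non‐standard codes is not assumed here.

```python
# Minimal sketch: tidy a protein sequence before pasting it into a web form.
# clean_sequence is a hypothetical helper written for this exercise only.

STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def clean_sequence(raw: str) -> str:
    """Remove all whitespace, upper-case the residues, and reject any
    character that is not a standard single-letter amino acid code."""
    seq = "".join(raw.split()).upper()
    invalid = sorted(set(seq) - STANDARD_AA)
    if invalid:
        raise ValueError("Non-standard residue codes: " + ", ".join(invalid))
    return seq


# First 24 residues of the example sequence, as printed with spaces.
spaced = "M T C G S G F R G R A F S C V S A C G P R P G R"
cleaned = clean_sequence(spaced)
print(cleaned)
print("Length:", len(cleaned))
```

The cleaned string can then be copied into the sequence box in one piece.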
28.3.2 Output
The output is available in three tabs on the same screen:
Summary: provides an easy‐to‐understand presentation of the secondary structure throughout the peptide sequence.
PSIPRED: shows thumbnails hyperlinked to the detailed result. Clicking a thumbnail opens the output in a new window, from which it can be downloaded as a figure in .png format.
Downloads: The whole output or a part of it can be downloaded in different formats, viz.
the whole output in a zipped file,
only the results as plain text,
the raw scores in plain text format,
the results in Postscript or PDF format.
28.3.3 Interpretation of results
The output shows the type of secondary structure for each of the amino acids, using the following notations. Please note the proportions of the different types of secondary structure from the result.
The detailed result diagrammatically represents the different types of secondary structure predicted. Each of the blocks contains the following rows:
Confidence (Conf): The blue bars indicate the confidence (or probability of accuracy of prediction) for each of the amino acids. The taller the bar, the more confident we can be about the predicted secondary structure for that residue.
Prediction represented by the cartoon (Pred): The black, thin straight line represents coils (C), the yellow‐colored arrow represents beta sheet (E), and the pink‐colored barrel or pipe indicates helix (H).
Prediction coded by alphabetic notations (Pred): “H” for helix, “C” for coil, and “E” for beta sheet.
Amino acid sequence (AA): Each of the amino acid residues is placed in sequence.
Finally, the amino acid count is given in multiples of 10 (viz. 10, 20, 30, …).
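The proportions of helix, strand, and coil can also be counted from the plain‐text result rather than estimated by eye. The sketch below assumes that the downloaded text contains rows prefixed with "Conf:", "Pred:", and "AA:", mirroring the rows described above; that layout should be checked against the actual file. The function names `parse_blocks` and `state_proportions` are hypothetical, and the sample text is made up for illustration only; it is not real PSIPRED output.

```python
# Minimal sketch: read Conf/Pred/AA rows from a plain-text result and
# compute the fraction of residues in each predicted state.
# The row prefixes are an assumption about the downloaded file layout.
from collections import Counter


def parse_blocks(text: str):
    """Concatenate the Conf, Pred, and AA rows found across all blocks."""
    conf, pred, aa = [], [], []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Conf:"):
            conf.append(line[len("Conf:"):].strip())
        elif line.startswith("Pred:"):
            pred.append(line[len("Pred:"):].strip())
        elif line.startswith("AA:"):
            aa.append(line[len("AA:"):].strip())
    return "".join(conf), "".join(pred), "".join(aa)


def state_proportions(pred: str) -> dict:
    """Return the fraction of residues predicted as H, E, and C."""
    if not pred:
        raise ValueError("Empty prediction string")
    counts = Counter(pred)
    return {state: counts.get(state, 0) / len(pred) for state in "HEC"}


# Made-up block for illustration only (not real PSIPRED output).
example = """
Conf: 9876543210
Pred: CCHHHHEEEC
  AA: MTCGSGFRGR
              10
"""

conf, pred, aa = parse_blocks(example)
for state, fraction in state_proportions(pred).items():
    print(f"{state}: {fraction:.0%}")
```

For a real result, the downloaded text would be read from the file and passed to `parse_blocks` in place of the made‐up example.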
28.4 SECONDARY STRUCTURE PREDICTION USING THE ONLINE CDM TOOL
The CDM tool has evolved from the GOR (Garnier, Osguthorpe, and Robson) methods of protein secondary structure prediction.
28.4.1 Procedure
Download an amino acid sequence: Download a peptide sequence in FASTA format.
Open your workspace and enter your working mail ID.
Input sequence: Either enter the PDB ID or paste the amino acid sequence (maximum 1000 amino acids) into the specified sequence box. Please ensure that the sequence is in raw sequence format (not FASTA format) and contains no letters or symbols other than the single‐letter codes of the 20 amino acids.
Click on “Submit”.
The result will be sent to the given email ID.
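Because the sequence is downloaded in FASTA format but must be submitted as raw sequence, a small conversion step can be useful. The sketch below drops the FASTA header line, joins the sequence lines, and checks the 1000‐residue limit stated in the procedure. The function name `fasta_to_raw` is hypothetical and is not part of the CDM tool, the sketch assumes a single record per file, and the record shown is an invented example.

```python
# Minimal sketch: convert a single-record FASTA entry to raw sequence
# and check it against the limits given in the procedure.
# fasta_to_raw is a hypothetical helper written for this exercise only.

MAX_LENGTH = 1000  # maximum number of amino acids stated in the procedure
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def fasta_to_raw(fasta_text: str) -> str:
    """Drop header lines (starting with '>'), join the remaining lines,
    and validate the residue codes and the overall length."""
    lines = [line.strip() for line in fasta_text.splitlines()]
    body = "".join(line for line in lines if line and not line.startswith(">"))
    seq = "".join(body.split()).upper()
    invalid = sorted(set(seq) - STANDARD_AA)
    if invalid:
        raise ValueError("Non-standard residue codes: " + ", ".join(invalid))
    if len(seq) > MAX_LENGTH:
        raise ValueError(f"Sequence has {len(seq)} residues; limit is {MAX_LENGTH}")
    return seq


# Invented record for illustration; replace with the downloaded file content.
record = """>example_peptide hypothetical header
MTCGSGFRGR
AFSCVSACGP
"""
print(fasta_to_raw(record))
```

The printed string contains only residue letters and can be pasted directly into the sequence box.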
28.5 QUESTIONS
1. Compare the secondary structures of caprine beta‐defensin (GenBank: ABF71365.1) and bubaline lingual antimicrobial peptide (GenBank: ABE66309.1) using the PSIPRED tool.
2. The NCBI Protein Accession number for bubaline Dicer peptide is given as BAP00765.1. Predict the secondary structure of the following domains: PAZ, RIBOc, DSRM and RNaseIII.
3. Compare the secondary structures of the antimicrobial domains of the following cathelicidin variants (NCBI Protein IDs): AGA63736.2, XP_006065246.1, NP_001277882.1.
4. What is meant by the “secondary structure of a protein”? What are the applications of secondary structure prediction?
5. What are the constraints in predicting beta‐pleated sheets?