CHAPTER 26
Construction of Phylogenetic Tree Using MEGA7

CS Mukhopadhyay and RK Choudhary

School of Animal Biotechnology, GADVASU, Ludhiana

26.1 INTRODUCTION

Molecular Evolutionary Genetics Analysis (MEGA) is an integrated tool, freely downloadable for research and education, for analyzing molecular data (nucleotide and protein sequences) and constructing phylogenetic trees. The latest version, MEGA 7.0.14, supports a range of bio‐computational analyses, for example: sequence alignment; determining the best evolutionary model; constructing phylogenetic trees and inferring ancestral sequences; mining online databases; estimating divergence times and rates of molecular evolution; and testing evolutionary hypotheses. The software is available at http://www.megasoftware.net/ and can be run in Windows (both GUI and command‐line‐based), as well as in Linux and Mac operating systems (command‐line‐based).

In this chapter, we will see how a phylogenetic tree can be constructed using the MEGA7 suite and then interpreted.

26.2 OBJECTIVE

To build a phylogenetic tree from a given set of molecular sequences.

26.3 PROCEDURE

26.3.1 Prepare the sequence file

Download the molecular sequence data (nucleotide or amino acid sequences), arrange them in FASTA format, and save them in a plain‐text (*.txt) file, for example using Notepad. The sequences need not all be of the same length, but they should be homologous (depending on the hypothesis being tested in the experiment). The descriptive line of each FASTA‐formatted sequence may be shortened (Figure 26.1).

Mega Example SRY-Notepad containing molecular sequences.

FIGURE 26.1 Compile the unaligned, homologous molecular sequences in FASTA format in a text file.
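
Shortening the descriptive lines by hand becomes tedious when there are many sequences. The following is a minimal Python sketch of one way to do it outside MEGA. The function names, the accession numbers and the sequences are hypothetical and serve only as an illustration; the shortening rule (accession plus the first two words of the description) is an assumption, not a MEGA requirement.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new descriptive line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def shorten_header(header, max_words=2):
    """Keep the accession (first token) plus the first few description words."""
    tokens = header.split()
    if not tokens:
        return header
    return "_".join([tokens[0]] + tokens[1:1 + max_words])


# Hypothetical input: the accessions and sequences below are made up.
example = """>ACC0001.1 Bos taurus hypothetical protein, partial cds
ATGTTCAGAGTATTGAAC
GACGATGTTTAC
>ACC0002.1 Ovis aries hypothetical protein
ATGTTCAGAGTACTGAAC
"""

records = read_fasta(example)
for header, sequence in records:
    print(">" + shorten_header(header))
    print(sequence)
```

The printed text can be saved as a *.txt file and then pasted or loaded into the Alignment Explorer as described in the next section.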

26.3.2 Uploading data file/pasting the sequences

  1. Open MEGA7 and click on Align → Click on Edit/Build Alignment (the first option) in the drop‐down menu.
  2. A small dialogue box will appear with three radio‐button options:
    1. Create a new alignment: Select this if you are starting afresh. Selecting the option will direct the user to another window to select the type of input sequence data, DNA or amino acid. Select the correct option and proceed. Copy all the sequences (in FASTA format) and paste them in the Alignment Explorer.
    2. Open a saved alignment session: Select this if a previous alignment session (one that has been worked on earlier) has already been saved. Select the file from the folder and proceed.
    3. Retrieve sequence from a file: Select this if you want to load sequences from a text file. The text (.txt) file containing molecular sequences (in FASTA, PAUP, MEGA, ALN, Phylip, GCG, PIR, NBRF, MSF or IG formats) is opened in the sequence editor for further analysis (multiple sequence alignment, MSA).

26.3.3 Align the sequences

Click on “Alignment” on the menu bar and select one of the two options in the drop‐down menu:

  1. Align by ClustalW: Opt for ClustalW when the input sequences are of comparable length and homologous (Figure 26.2).
  2. Align by Muscle: This option is preferred for sequences of considerably varying length that nevertheless belong to the same super‐family. Of the two algorithms, ClustalW (progressive algorithm) and Muscle (iterative algorithm), Muscle generally performs well when the input sequences vary in length.

MEGA7 interface with selected Alignment menu displaying a drop-down list.

FIGURE 26.2 Aligning the input sequences using either ClustalW or Muscle available in MEGA7 interface.

26.3.4 Save session

The alignment session can be saved as a *.mas file for future use (Figure 26.3).

MEGA interface with selected Data menu displaying the highlighted Save Session option.

FIGURE 26.3 Exporting the alignment file and saving the alignment session for further use.

26.3.5 Export alignment

The alignment data can be exported in any of the following file formats: MEGA, FASTA, PAUP.
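
Before moving on, an alignment exported in FASTA format can be given a quick sanity check: after alignment, every sequence should have the same length, gaps included. The sketch below is one hedged way to do this in Python; the function names and the toy sequences are hypothetical, and the check is a convenience outside MEGA, not part of the program.

```python
from collections import Counter


def parse_fasta(text):
    """Return {name: aligned_sequence} from FASTA-formatted text."""
    seqs, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = ""
        elif line and name is not None:
            seqs[name] += line
    return seqs


def odd_length_sequences(seqs):
    """Names of sequences whose length differs from the most common length."""
    if not seqs:
        return []
    common_length, _ = Counter(len(s) for s in seqs.values()).most_common(1)[0]
    return sorted(name for name, s in seqs.items() if len(s) != common_length)


# Hypothetical exported alignment; seq3 is deliberately truncated.
exported = """>seq1
ATG-CCGTA
>seq2
ATGACCGTA
>seq3
ATGACCG
"""

seqs = parse_fasta(exported)
problems = odd_length_sequences(seqs)
print(problems or "all sequences have the same aligned length")
```

A non‐empty list suggests that the file was exported before alignment or was edited afterwards, and that it should be re‐exported from the Alignment Explorer.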

Now, close the Alignment Explorer window to proceed to the phylogenetic analysis.

26.3.6 Phylogenetic tree construction

Open the main window of MEGA7 and click on the “Phylogeny” tab in the menu bar. Select the algorithm you need for phylogenetic analysis from the drop‐down menu. Here we will choose the option “Construct/Test Neighbor‐Joining Tree”.
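
To give a feel for what the Neighbor‐Joining option does internally, here is a minimal Python sketch of the textbook NJ procedure working on a pairwise distance matrix. It is an illustration under stated assumptions, not MEGA's implementation: the function name, the internal node labels (node1, node2, ...) and the five‐taxon distance matrix are hypothetical and are not taken from this chapter.

```python
def neighbor_joining(names, matrix):
    """Return a list of (child, parent, branch_length) edges of an unrooted tree."""
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 1
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all the others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Choose the pair that minimises the NJ criterion Q.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = "node%d" % next_id
        next_id += 1
        # Branch lengths from the joined pair to the new internal node.
        length_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2.0 * (n - 2))
        length_b = d[a][b] - length_a
        edges.append((a, u, length_a))
        edges.append((b, u, length_b))
        # Distances from the new node to every remaining node.
        d[u] = {}
        for k in nodes:
            if k not in (a, b):
                d[u][k] = d[k][u] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Hypothetical distance matrix for five taxa.
names = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

edges = neighbor_joining(names, matrix)
for child, parent, length in edges:
    print(child, "-", parent, ":", length)
```

For this matrix the first pair joined is a and b, with branch lengths 2 and 3 to the new internal node. In MEGA the distance matrix is computed from the alignment using the model selected in the next section.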

26.3.7 Selection of tree construction parameters

  1. Test of Phylogeny: Select “Bootstrap method” to test the reliability of the branching pattern by re‐sampling.
  2. Number of Bootstrap replications: Run 500 re‐samplings if the sequences are long and/or numerous (to limit the computing time); otherwise consider 1000 bootstrap replications. At least 100 bootstrap re‐samplings are suggested for validating the branching of the constructed tree.
  3. Model/Method: The drop‐down menu displays a list of models (e.g., Number of differences, p‐distance, Poisson model, JTT, etc., depending on the algorithm and the type of sequence chosen). It is better to first run the program “Find Best DNA/Protein Models (ML)”, available under the “Models” tab in the menu bar (Figure 26.4). However, model selection is time‐consuming and is more applicable to the Maximum Likelihood‐based algorithm. For protein sequences we can, in general, select an advanced model such as Jones–Taylor–Thornton (JTT). Please remember that the NJ and ME methods do not assume a constant rate of evolutionary change across lineages, whereas UPGMA does; whether transitions and transversions are treated as occurring at the same rate depends on the substitution model, not on the tree‐building method. Accordingly, select the model to suit both the data and the method you opt for in phylogenetic tree construction (Figure 26.5).
    MEGA interface displaying the selected Models tab with highlighted Find Best DNA Models (ML) in the option list.

    FIGURE 26.4 Selection of the best evolutionary model for further analyses.

    Top: MEGA interface with selected Phylogeny tab highlighting the Construct/Neighbor-Joining Tree in the option. Bottom: Options Summary tab with displayed parameters.

    FIGURE 26.5 Setting the parameters for phylogenetic analysis.

  4. Rates among Sites: There are two options (for nucleotide sequences as input): “Gamma Distributed” and “Uniform rates”. Opt for “Gamma Distributed” if the sequences are sufficiently divergent.
  5. Gamma parameter: The gamma distribution used for modeling the evolutionary rates is specified by its gamma parameter (the shape parameter, varying from 1 to 5). Here it is assumed that the substitution rate varies from site to site.
  6. Gaps/Missing Data Treatment: Select “Complete deletion”, which removes every alignment site containing a gap or missing data in any sequence.
  7. Now, run the analysis for tree construction, by clicking on “Compute”.
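
Several of the parameters above can be illustrated numerically. The Python sketch below applies complete deletion to a toy protein alignment and then computes the p‐distance, the Poisson‐corrected distance and a gamma distance using the standard textbook formulas. The function names and the sequences are hypothetical, and treating only "-" and "?" as gap or missing‐data symbols is an assumption; this is not MEGA's own code.

```python
import math

GAP_SYMBOLS = set("-?")  # assumption: symbols treated as gap or missing data


def complete_deletion(aligned):
    """Drop every alignment column in which any sequence has a gap symbol."""
    kept = [col for col in zip(*aligned) if not (set(col) & GAP_SYMBOLS)]
    return ["".join(col[i] for col in kept) for i in range(len(aligned))]


def p_distance(s1, s2):
    """Proportion of sites at which two equally long sequences differ."""
    return sum(1 for x, y in zip(s1, s2) if x != y) / len(s1)


def poisson_distance(p):
    """Poisson correction for multiple substitutions: d = -ln(1 - p)."""
    return -math.log(1.0 - p)


def gamma_distance(p, shape):
    """Gamma distance: d = a * ((1 - p) ** (-1 / a) - 1), a = shape parameter."""
    return shape * ((1.0 - p) ** (-1.0 / shape) - 1.0)


# Hypothetical aligned protein fragments (columns 4 and 9 contain gaps).
aligned = ["MKTALLVAGH",
           "MKT-LLIAGH",
           "MRTALLIA-H"]

cleaned = complete_deletion(aligned)
p = p_distance(cleaned[0], cleaned[2])
print(cleaned)
print("p-distance:", p)
print("Poisson distance:", poisson_distance(p))
print("Gamma distance (shape = 1):", gamma_distance(p, 1.0))
```

Both corrected distances are larger than the raw p‐distance because they allow for multiple substitutions at the same site, and the gamma distance additionally allows for rate variation among sites.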

26.4 INTERPRETATION OF PHYLOGENETIC TREE

The phylogenetic tree displays the branch scale at the bottom.

  1. Node IDs: Each internal node is given a unique numerical ID for identification.
  2. Branch length: Each branch has a length (corresponding to the scale given at the bottom) that indicates the amount of residue substitution along that branch.
  3. Bootstrap value: This is the percentage of bootstrap replicates in which the same grouping was recovered. It indicates the stability of the branching pattern but is not a direct measure of the accuracy of the tree.
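
The idea behind a bootstrap value can be sketched in a few lines of Python: alignment columns are re‐sampled with replacement, the analysis is repeated on each pseudo‐alignment, and the support is the percentage of replicates that recover a given grouping. To keep the example short, "the pair of taxa with the smallest p‐distance" is used here as a deliberately simplified stand‐in for building a full tree. All names and sequences are hypothetical, and this is not how MEGA is implemented.

```python
import random
from itertools import combinations


def bootstrap_replicate(aligned, rng):
    """Resample alignment columns with replacement, keeping the same length."""
    length = len(aligned[0])
    picks = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in picks) for seq in aligned]


def p_distance(s1, s2):
    """Proportion of sites at which two equally long sequences differ."""
    return sum(x != y for x, y in zip(s1, s2)) / len(s1)


def closest_pair(names, aligned):
    """Simplified stand-in for tree building: the pair with the smallest p-distance."""
    pairs = combinations(range(len(names)), 2)
    i, j = min(pairs, key=lambda ij: p_distance(aligned[ij[0]], aligned[ij[1]]))
    return frozenset((names[i], names[j]))


def pair_support(names, aligned, pair, replicates=500, seed=1):
    """Percentage of bootstrap replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    hits = sum(
        closest_pair(names, bootstrap_replicate(aligned, rng)) == frozenset(pair)
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


# Hypothetical toy alignment of four taxa.
names = ["taxonA", "taxonB", "taxonC", "taxonD"]
aligned = ["ATGCCGTAAGCTTAGC",
           "ATGCCGTAAGCTTGGC",
           "ATGACGTTAGCATAGC",
           "TTGACGTTAGGATAGA"]

support = pair_support(names, aligned, ("taxonA", "taxonB"), replicates=500)
print("Support for taxonA + taxonB: %.1f%%" % support)
```

A grouping that is recovered in most replicates is stable with respect to re‐sampling of the sites, which is what the values printed at the nodes of a MEGA tree summarize.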

26.4.1 Controlling the output of phylogenetic tree

The generated tree can be adjusted to suit the requirements of a presentation by changing its size and branch positions, toggling the display of bootstrap values and branch lengths, etc. (Figure 26.6). Everything can be controlled through the menu‐bar options or the buttons displayed in the left‐hand side pane (Windows OS). Figure 26.7 indicates the various buttons on the GUI for controlling the appearance of the tree.

MEGA interface with selected View tab displaying options in the Show/Hide menu. Taxon Label, Taxon Marker, and Branch scale in the options under Show/Hide menu are checked.

FIGURE 26.6 Controlling the display parameters using the menu bar parameters.

MEGA interface with selected Original tree tab displaying eleven different buttons on the left-hand side.

FIGURE 26.7 Controlling the tree display parameters using the left‐hand‐side buttons.

26.4.2 Diagrams for each of the taxa

These can be inserted as follows:

  1. Click on “Subtree” in the menu bar → Select “Use Subtree Draw Options” → Click on the “Image” tab of “Subtree Drawing Options” and select the required image from the folder in which it is saved (Figure 26.8).
    Left–right: Property, Display, and Image tabs with displayed properties.

    FIGURE 26.8 Insertion of figures for the external nodes (species name).

  2. Save the Phylogenetic Tree: Click on the “Image” option in the menu bar → Click on “Save as PNG file” (Figure 26.9).
    MEGA interface displaying a phylogenetic tree. On upper left of the interface is the selected Image tab with the highlighted Save as PNG file. At bottom is a scale bar as branch scale.

    FIGURE 26.9 Saving the output phylogenetic tree as a PNG file.

26.5 QUESTIONS

  1. Construct a phylogenetic tree using the neighbor‐joining method, with bootstrap re‐sampling of 500, using a set of homologous protein sequences.
  2. Consider the previous example and increase the bootstrap re‐sampling to 1000. Is there any change in the branching pattern reliability values (i.e., bootstrap values)? Display the tree so that only bootstrap values of more than 75 are shown at the nodes.
  3. Construct a phylogenetic tree with each of the following algorithms: ME, NJ, UPGMA, Maximum Likelihood, using the protein sequences: NP_001272506.1 AAI20478.1 CAH23217.1 XP_005909397.1 XP_005955229.1. Now, compare the trees. The bootstrap re‐sampling should be 500 for all the algorithms. Please determine the best evolutionary model before running the phylogeny analysis.
  4. Determine the best model for phylogenetic tree construction using the following nucleotide sequences, and then construct a circular phylogenetic tree with bootstrap re‐sampling and the minimum evolution algorithm: AB974690.1 AB973433.1 NM_001009772.1 NM_001009406.1 NM_001009787.1 NM_001285577.1
  5. Interpret the given output generated by MEGA using the NJ method:
    Phylogenetic tree with branches labeled KP027016|WeissellaSp, NR074540|BacillusCereus, U26053|MS U26053, AY036903|DesulfotomaculumKuznetsovi, HG792421|PantoeaStewartii, etc.

    FIGURE 26.10
