CHAPTER 21
Construction of Phylogenetic Tree: Unweighted‐Pair Group Method with Arithmetic Mean (UPGMA)

CS Mukhopadhyay and RK Choudhary

School of Animal Biotechnology, GADVASU, Ludhiana

21.1 INTRODUCTION

UPGMA is a clustering algorithm that builds a tree by repeatedly joining the pair of sequences (or clusters) with the greatest similarity, that is, the smallest distance, and then calculating the mean distances from the joined pair to the remaining sequences. A UPGMA tree is “ultrametric”: all the terminal nodes are equidistant from the root. Hence, once the final join places the root, a rooted tree is produced.

  • Unweighted: All pair‐wise distances contribute equally. No taxon pair is weighted to indicate an evolutionary rate different from that of any other pair. This is in contrast to the Weighted‐Pair Group Method with Arithmetic mean (WPGMA).
  • Pair‐groups: Any two taxa, any two clusters (clades), or one taxon and one cluster are always combined in pairs (that is, interpreted as dichotomies).
  • Arithmetic mean: The distance between two groups is the arithmetic mean of the pair‐wise distances between all members of one group and all members of the other.
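
The difference between the “unweighted” and “weighted” averages can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `upgma_update` and `wpgma_update` and the example numbers are assumptions made here, not part of the chapter.

```python
def upgma_update(d_ik, d_jk, n_i, n_j):
    """Distance from the merged cluster (i + j) to another cluster k under UPGMA.

    Weighting each term by cluster size means every original taxon pair
    contributes equally to the mean.
    """
    return (n_i * d_ik + n_j * d_jk) / (n_i + n_j)


def wpgma_update(d_ik, d_jk):
    """Same update under WPGMA: a plain average that ignores cluster sizes."""
    return (d_ik + d_jk) / 2


# Hypothetical numbers: cluster i holds 2 taxa, cluster j holds 1 taxon.
d_ik, d_jk = 6.0, 9.0
print(upgma_update(d_ik, d_jk, n_i=2, n_j=1))  # 7.0
print(wpgma_update(d_ik, d_jk))                # 7.5
```

When the two clusters being merged have the same size, both updates give the same value; they differ only when the cluster sizes are unequal.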

21.2 ASSUMPTIONS

  1. Constant rate of evolution (i.e., mutation‐rate) amongst all the sequences.
  2. Distance data are ultrametric, that is, they satisfy the “three‐point condition”: for any three taxa, the two largest of the three pair‐wise distances are equal. This is what allows clustering to generate the tree.
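
As a rough illustration of the second assumption, the three‐point condition can be checked directly on a distance matrix. The sketch below uses the distances from Table 21.1 of this chapter; the helper name `is_ultrametric` and the dictionary layout are choices made for this example only.

```python
from itertools import combinations


def is_ultrametric(dist, taxa, tol=1e-9):
    """Return True if every triple of taxa satisfies the three-point
    condition: the two largest of its three pairwise distances are equal."""
    def d(x, y):
        return dist[frozenset((x, y))]

    for a, b, c in combinations(taxa, 3):
        smallest, middle, largest = sorted((d(a, b), d(a, c), d(b, c)))
        if abs(largest - middle) > tol:
            return False
    return True


taxa = ["A", "B", "C", "D", "E"]
pairs = {
    ("A", "B"): 3,
    ("A", "C"): 7, ("B", "C"): 7,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
    ("A", "E"): 10, ("B", "E"): 10, ("C", "E"): 10, ("D", "E"): 8,
}
# Key each distance by an unordered pair so lookups work in either order.
dist = {frozenset(p): v for p, v in pairs.items()}

print(is_ultrametric(dist, taxa))  # True
```

Changing a single entry, for example setting d(AC) to 6, makes the triple A, B, C fail the check (distances 3, 6, and 7), so the function returns False.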

21.3 OBJECTIVE

To construct a phylogenetic tree (dendrogram), using the UPGMA method, from a set of molecular sequences.

21.4 PROCEDURE

  1. Calculate the raw pair‐wise distance data from a set of sequences and construct a distance matrix:

    TABLE 21.1

         A    B    C    D
    B    3
    C    7    7
    D   10   10   10
    E   10   10   10    8

    Note: while constructing the tree from the distance values, one needs to select the closest pair (with minimum distance among all possible pairs) from the distance matrix, and then merge these two objects to yield one.

  2. Identify the least‐distant pair: Here, the minimum distance is d(AB) = 3 (i.e., between the taxa “A” and “B”).
  3. Place these two taxa in a single group as a cluster and consider the duo as a single external node.
  4. Since d(AB) = 3, the depth of divergence (the length of each branch) of this sub‐tree will be 3/2 = 1.5 units.
    Schematic illustrating UPGMA method, displaying a vertical line connecting two horizontal lines labeled A (top) and B (bottom), each with a depth of 1.5 equivalent to 1 unit.

    FIGURE 21.1

  5. Now consider “AB” as a single taxon and repeat the same steps, as above. Find the shortest distance and make the respective taxa a single cluster.
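    d(AB, C) = [d(AC) + d(BC)] / 2 = (7 + 7) / 2 = 7
    d(AB, D) = [d(AD) + d(BD)] / 2 = (10 + 10) / 2 = 10
    d(AB, E) = [d(AE) + d(BE)] / 2 = (10 + 10) / 2 = 10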

    TABLE 21.2

         AB    C    D
    C     7
    D    10   10
    E    10   10    8
  6. Add the new taxon (C) to the “AB” cluster, since “AB” and “C” have the least distance (i.e., 7), to produce the sub‐tree of ABC, in which the AB sub‐tree (drawn in the previous step) is attached at the point M. The length of AK + AN = OC = 7/2 = 3.5 units, so the branch joining the AB sub‐tree to the new node measures 3.5 – 1.5 = 2 units.

    FIGURE 21.2

    For the sake of understanding, the internal nodes of the branches have been marked (as K, L, N, etc.). These are not required for tree construction in general.

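  7. Now repeat the last two steps: calculate the mean distances between the (AB)C cluster and each of the remaining sequences (“D” and “E”), and then draw the phylogenetic tree:
    d(ABC, D) = [d(AD) + d(BD) + d(CD)] / 3 = (10 + 10 + 10) / 3 = 10
    d(ABC, E) = [d(AE) + d(BE) + d(CE)] / 3 = (10 + 10 + 10) / 3 = 10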

    TABLE 21.3

         ABC    D
    D     10
    E     10    8
  8. Now, the least distance is DE = 8. We will repeat the same steps as before: The branch length will be 8/2 = 4 units.
    Phylogenetic tree with branches QD (top) and RE (bottom). Branch length is 4.

    FIGURE 21.3

  9. Now, calculate the distance between these two clusters ABC and DE:
    d(ABC, DE) = [d(AD) + d(AE) + d(BD) + d(BE) + d(CD) + d(CE)] / 6 = (6 × 10) / 6 = 10
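
The whole procedure can be sketched in plain Python. The code below is a minimal illustration under stated assumptions, not a production implementation: the function name `upgma`, the tuple‐based cluster representation, and the rule for breaking ties (first pair found in list order) are all choices made for this example. The input is the matrix of Table 21.1.

```python
def upgma(taxa, pairs):
    """Cluster taxa by UPGMA.

    Returns the merges in the order they occur as (left, right, height),
    where height is half the distance between the two merged clusters.
    """
    # A cluster is a tuple of member taxa; distances are keyed by cluster pair.
    clusters = [(t,) for t in taxa]
    dist = {frozenset(((a,), (b,))): float(v) for (a, b), v in pairs.items()}
    merges = []

    while len(clusters) > 1:
        # Find the closest pair of clusters (first one found wins on ties).
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist[frozenset((clusters[i], clusters[j]))]
                if best is None or d < best[0]:
                    best = (d, clusters[i], clusters[j])
        d, ci, cj = best
        merged = ci + cj
        merges.append(("".join(ci), "".join(cj), d / 2))

        # Distance from the merged cluster to every other cluster:
        # the size-weighted mean of the two old distances.
        rest = [c for c in clusters if c not in (ci, cj)]
        for ck in rest:
            d_ik = dist[frozenset((ci, ck))]
            d_jk = dist[frozenset((cj, ck))]
            dist[frozenset((merged, ck))] = (
                len(ci) * d_ik + len(cj) * d_jk
            ) / (len(ci) + len(cj))
        clusters = [merged] + rest

    return merges


taxa = ["A", "B", "C", "D", "E"]
pairs = {
    ("A", "B"): 3,
    ("A", "C"): 7, ("B", "C"): 7,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
    ("A", "E"): 10, ("B", "E"): 10, ("C", "E"): 10, ("D", "E"): 8,
}

merges = upgma(taxa, pairs)
for left, right, height in merges:
    print(f"join {left} + {right} at height {height}")
```

For this matrix the joins occur at heights 1.5 (A with B), 3.5 (AB with C), 4.0 (D with E), and 5.0 (the two remaining clusters), matching the manual steps above.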

21.5 INTERPRETATION OF UPGMA TREE

The distance from the root to each OTU (operational taxonomic unit, i.e., terminal taxon) of each cluster = 10/2 = 5 units.

Hence, the distance of TP = 5 – OC = 5 – 3.5 = 1.5 units.

The distance of US = 5 – 4 = 1.0 unit.

The UPGMA tree obtained in our example depicts the evolutionary distances between the taxa. To calculate the distance between any two taxa (for example, “A” and “D”), add up the lengths of the branches connecting them: 1.5 + 2 + 1.5 + 1 + 4 = 10. This is exactly the value given in the distance table (Table 21.1). Under the assumption of equal evolutionary rates, these values indicate the evolutionary distances between the taxa.
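
The same bookkeeping can be expressed as a short Python sketch that sums branch lengths along the path between two taxa. The branch lengths are those derived in this chapter; the internal node names (`AB`, `ABC`, `DE`, `root`) and the function names are labels assumed for this example and do not correspond to the letters used in the figures.

```python
# child -> (parent, branch length), taken from the worked example
parent = {
    "A": ("AB", 1.5), "B": ("AB", 1.5),
    "AB": ("ABC", 2.0), "C": ("ABC", 3.5),
    "D": ("DE", 4.0), "E": ("DE", 4.0),
    "ABC": ("root", 1.5), "DE": ("root", 1.0),
}


def distances_to_ancestors(node):
    """Map the node and each of its ancestors to the distance from the node."""
    total = 0.0
    seen = {node: 0.0}
    while node in parent:
        node, length = parent[node]
        total += length
        seen[node] = total
    return seen


def tip_distance(x, y):
    """Sum of branch lengths on the path between taxa x and y."""
    up_x = distances_to_ancestors(x)
    up_y = distances_to_ancestors(y)
    # Dicts keep insertion order, so the first shared node on y's way up
    # is the most recent common ancestor of x and y.
    mrca = next(n for n in up_y if n in up_x)
    return up_x[mrca] + up_y[mrca]


print(tip_distance("A", "D"))  # 10.0
print(tip_distance("A", "B"))  # 3.0
```

Because the tree is ultrametric, each tip‐to‐tip distance is simply twice the height of the most recent common ancestor, which is why the values reproduce the original matrix.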

Phylogenetic tree with two sub-trees.

FIGURE 21.4

Phylogenetic tree with three sub-trees (left) and a table containing data (right).

FIGURE 21.5

21.6 QUESTIONS

  1. Draw a phylogenetic tree manually using the following distance matrices:
    1. TABLE 21.4

           A    B    C    D    E
      B    8
      C   18   18
      D   18   18   10
      E   18   18   10    4
      F   20   20   20   20   20
    2. TABLE 21.5

           A    B    C    D    E
      A    0
      B    4    0
      C    8    8    0
      D    8    8    6    0
      E    8    8    6    2    0
  2. What are the merits and demerits of the UPGMA method of phylogenetic tree construction?
  3. Explain in detail why ultrametric data are needed for UPGMA tree construction.
  4. Under what circumstances do we prefer the UPGMA tree? How do you interpret the results of the UPGMA tree?
  5. Suppose we have morphological data from which similarity and distance matrices can be constructed. Can we use such a distance matrix for the construction of a UPGMA tree? Justify your answer.