CHAPTER 25 Construction of Phylogenetic Tree: Minimum Evolution Method
CS Mukhopadhyay and RK Choudhary
School of Animal Biotechnology, GADVASU, Ludhiana
25.1 INTRODUCTION
The minimum evolution (ME) method is a distance‐based method. The version described here is simple enough to avoid the least‐squares approach when determining the optimal tree, that is, the tree with the minimum total branch length. It differs in approach from the maximum parsimony (MP) method: ME counts the sites that differ between each pair of input sequences to produce a distance matrix, and it yields an unrooted tree from the given set of sequences.
25.1.1 Principle
The tree length is obtained by summing up the character differences, and the tree with the minimum total branch length is finally reconstructed.
25.1.2 Assumptions
The character states (residues) change independently along the lineages.
The rate of evolution is constant over time.
The branch lengths are additive.
The tree with the smallest summed branch length is the true one (Rzhetsky and Nei, 1993).
25.2 OBJECTIVE
To construct a phylogenetic tree using the ME method from a given set of nucleotide sequences.
25.3 PROCEDURE
Start with four nucleotide sequences (the number is chosen arbitrarily).
First, align the sequences (S1–S4), as shown in Table 25.1:
TABLE 25.1
Site   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  18  19  20
S1     C   G   T   A   G   G   A   T   A   G   A   C   A   T   A   G   G   C
S2     C   G   A   A   G   G   G   T   A   G   A   C   G   T   A   G   G   C
S3     C   A   A   A   G   G   A   G   A   G   C   T   A   T   G   G   G   C
S4     C   C   A   A   G   G   A   C   A   G   G   T   A   T   T   G   G   C
Next, determine the distance matrix by counting the number of differing residues (designated by S) between each pair of sequences:
TABLE 25.2
      S1  S2  S3  S4
S1     0
S2     3   0
S3     6   7   0
S4     6   7   4   0
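The counting step behind Table 25.2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the sequence strings are copied from Table 25.1, and the helper name `count_differences` and the dictionary `distances` are hypothetical names introduced here.

```python
from itertools import combinations

# Aligned sequences copied from Table 25.1 (18 columns each).
seqs = {
    "S1": "CGTAGGATAGACATAGGC",
    "S2": "CGAAGGGTAGACGTAGGC",
    "S3": "CAAAGGAGAGCTATGGGC",
    "S4": "CCAAGGACAGGTATTGGC",
}


def count_differences(a, b):
    """Number of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Distance for every unordered pair of sequences (hypothetical container).
distances = {
    (p, q): count_differences(seqs[p], seqs[q])
    for p, q in combinations(seqs, 2)
}

for (p, q), d in distances.items():
    print(f"d({p}, {q}) = {d}")
```

Run on the four sequences above, the printed values should agree with the lower triangle of Table 25.2.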
The phylogenetic tree is constructed from the distances between sequence‐pairs, indicating the difference between each pair of sequences:
Just like UPGMA, the principle of ultrametric tree construction (assuming a molecular clock) is followed here. First, find the pair of sequences (or taxa) with the minimum distance (here, d(S1 – S2) = 3). Hence, S1 and S2 form a single cluster.
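As a small sketch of this first clustering step, the closest pair can be picked directly from the distance values of Table 25.2. The dictionary layout and the name `closest_pair` are assumptions made for illustration.

```python
# Pairwise distances copied from Table 25.2.
dist = {
    ("S1", "S2"): 3, ("S1", "S3"): 6, ("S1", "S4"): 6,
    ("S2", "S3"): 7, ("S2", "S4"): 7, ("S3", "S4"): 4,
}

# The pair with the smallest distance is clustered first.
closest_pair = min(dist, key=dist.get)
print(closest_pair, dist[closest_pair])
```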
The ultrametric three‐point condition is then checked: dxy <= max(dxz, dyz) for any three taxa x, y and z (Desper and Gascuel, 2005).
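A strict version of the three‐point check can be sketched as follows. It relies on the fact that the inequality holds for every labelling of a triple exactly when the two largest of its three distances are equal. The helper names `d` and `satisfies_three_point` are hypothetical, and the distances are those of Table 25.2.

```python
from itertools import combinations

# Pairwise distances copied from Table 25.2.
dist = {
    ("S1", "S2"): 3, ("S1", "S3"): 6, ("S1", "S4"): 6,
    ("S2", "S3"): 7, ("S2", "S4"): 7, ("S3", "S4"): 4,
}


def d(x, y):
    """Symmetric lookup in the pairwise distance table."""
    return dist[(x, y)] if (x, y) in dist else dist[(y, x)]


def satisfies_three_point(x, y, z):
    """True if d_xy <= max(d_xz, d_yz) holds for every labelling of the triple,
    which is equivalent to the two largest distances being equal."""
    _, middle, largest = sorted([d(x, y), d(x, z), d(y, z)])
    return middle == largest


taxa = ["S1", "S2", "S3", "S4"]
results = {t: satisfies_three_point(*t) for t in combinations(taxa, 3)}
for triple, ok in results.items():
    print(triple, ok)
```

With the Table 25.2 values, the strict check passes for the two triples that contain both S3 and S4 and fails for the two that contain both S1 and S2 (the distances 6 and 7 are close but not equal), so these example distances are only approximately ultrametric.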
Now, for the four‐point condition, note that d(S1 – S2) + d(S3 – S4) = 3 + 4 = 7, which does not exceed d(S2 – S3) or d(S2 – S4) (both 7). Of the three pairwise sums, the two largest are equal, d(S1 – S3) + d(S2 – S4) = d(S1 – S4) + d(S2 – S3) = 6 + 7 = 13, so the condition holds: d(S1 – S2) + d(S3 – S4) <= max{d(S1 – S3) + d(S2 – S4), d(S1 – S4) + d(S2 – S3)}
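The three pairwise sums involved in the four‐point condition can be computed explicitly. The sketch below uses the values of Table 25.2; the function name `quartet_sums` and the string keys naming each split are assumptions introduced for illustration.

```python
# Pairwise distances copied from Table 25.2.
dist = {
    ("S1", "S2"): 3, ("S1", "S3"): 6, ("S1", "S4"): 6,
    ("S2", "S3"): 7, ("S2", "S4"): 7, ("S3", "S4"): 4,
}


def d(x, y):
    """Symmetric lookup in the pairwise distance table."""
    return dist[(x, y)] if (x, y) in dist else dist[(y, x)]


def quartet_sums(w, x, y, z):
    """Summed distances for the three ways of splitting four taxa into two pairs."""
    return {
        f"{w}{x}|{y}{z}": d(w, x) + d(y, z),
        f"{w}{y}|{x}{z}": d(w, y) + d(x, z),
        f"{w}{z}|{x}{y}": d(w, z) + d(x, y),
    }


sums = quartet_sums("S1", "S2", "S3", "S4")
smallest, middle, largest = sorted(sums.values())

# Four-point condition: the two largest sums are equal.
four_point_holds = middle == largest
print(sums, four_point_holds)
```

The split with the smallest sum, here S1S2|S3S4, is the pairing that the condition supports.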
The tree‐metric is thus constructed, and this enables one to place S1 and S2 as a cluster and S3 and S4 as another cluster.
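To see why this grouping is also the minimum evolution choice, one can compare the total length of the three possible unrooted trees for four taxa. The sketch below is an illustration under a stated assumption, not the chapter's own calculation: the tree length is estimated as half the within‐pair distances plus a quarter of the between‐pair distances, which equals the summed branch lengths exactly when the distances are additive on that topology. The names `tree_length`, `topologies` and `best` are hypothetical.

```python
# Pairwise distances copied from Table 25.2.
dist = {
    ("S1", "S2"): 3, ("S1", "S3"): 6, ("S1", "S4"): 6,
    ("S2", "S3"): 7, ("S2", "S4"): 7, ("S3", "S4"): 4,
}


def d(x, y):
    """Symmetric lookup in the pairwise distance table."""
    return dist[(x, y)] if (x, y) in dist else dist[(y, x)]


def tree_length(pair1, pair2):
    """Estimated length of the unrooted quartet tree grouping pair1 against pair2.

    Assumption (not from the chapter): within-pair distances are weighted 1/2
    and between-pair distances 1/4.
    """
    (a, b), (c, e) = pair1, pair2
    within = d(a, b) + d(c, e)
    between = d(a, c) + d(a, e) + d(b, c) + d(b, e)
    return within / 2 + between / 4


# The three possible unrooted topologies for four taxa.
topologies = [
    (("S1", "S2"), ("S3", "S4")),
    (("S1", "S3"), ("S2", "S4")),
    (("S1", "S4"), ("S2", "S3")),
]

lengths = {t: tree_length(*t) for t in topologies}
best = min(lengths, key=lengths.get)
for t, length in lengths.items():
    print(t, length)
print("shortest:", best)
```

Under this estimate the tree that pairs S1 with S2 and S3 with S4 has length 10, against 11.5 for each of the other two topologies.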
25.3.1 Difference between minimum evolution and maximum parsimony methods (Figure 25.1)
25.4 INTERPRETATION OF THE ME TREE
Here, the ME tree is inferred, and the corresponding differences with the MP tree are shown.
ME is a distance‐based method that selects the unrooted tree by an optimality criterion (the minimum total branch length), whereas MP is a discrete (character‐based) method that determines the branch lengths from the fit of the individual residues of the sequences to a tree.
The ME tree specifies only the number of differences (as a distance) on a branch for each pair of sequences; the position‐specific differences are not indicated. MP, by contrast, identifies each site of difference between any pair of sequences and shows it on the respective branches.
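The contrast between a single distance value and site‐specific information can be illustrated with the sequences of Table 25.1. This is only a sketch: the function name `differing_sites` is hypothetical, and the site labels follow the column headings of Table 25.1 as printed.

```python
# Column labels and aligned sequences copied from Table 25.1.
site_labels = list(range(1, 16)) + [18, 19, 20]
seqs = {
    "S1": "CGTAGGATAGACATAGGC",
    "S2": "CGAAGGGTAGACGTAGGC",
    "S3": "CAAAGGAGAGCTATGGGC",
    "S4": "CCAAGGACAGGTATTGGC",
}


def differing_sites(p, q):
    """Labels of the alignment columns at which sequences p and q differ."""
    return [
        label
        for label, x, y in zip(site_labels, seqs[p], seqs[q])
        if x != y
    ]


sites_12 = differing_sites("S1", "S2")
sites_34 = differing_sites("S3", "S4")

# A distance method keeps only the count; a character method keeps the sites.
print(len(sites_12), sites_12)
print(len(sites_34), sites_34)
```

The counts are the entries of the distance matrix used by ME, while the lists of sites are the kind of detail that a character‐based method such as MP retains.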
25.5 QUESTIONS
1. Construct a phylogenetic tree using the following homologous partial sequences by the ME method:
> Bbu
MASFRVKETVCPRTSQQPLEQCDFKENG
> BtaTV4
MTSFTVKETVCPRTSPQPPEQCDFKENG
> BtaTV2
MTSFTVKETVCPRTSPQPPEQCDFKENG
> BtaTV1
MVSFRVKETDCPRTSQQPLEQCDFKENG
> BtaTV3
MVSFRVKETDCPRTSQQPLEQCDFKENG
2. Construct a phylogenetic tree using this set of peptide sequences (ME method):
> Bbu
NELQSVRRFRPRRPRLPRPRPRPLPL
> BtaTV4
NELQSVR-FRP-PIRRPPIRP---PF
> BtaTV2
NELQSVRRIRPRPPRLPRPRPRPLPF
> BtaTV1
NELQSVRRIRPRPPRLPRPRPRPLPF
> BtaTV3
NELQSVR-FRP-PIRRPPIRP---PF
3. Now compare these two phylogenetic trees and explain the possible reasons for their differences, given that both sets come from the same source sequences but from different portions of the peptides.
4. Enumerate the differences between ME and MP trees. Also, explain the conditions under which each of these two algorithms is best suited.
5. Discuss the assumptions of the ME algorithm and how realistic they are in practice.