This is the Title of the Book, eMatter Edition
Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
Chapter 2: Biological Sequences
Molecular Clocks
If you compare sequences from related organisms, it is clear that certain positions
change very little over time while others change rapidly. For example, parts of the
ribosomal RNA are identical in every organism sequenced to date, from bacteria to
humans. These subsequences are so important that if they change, the organism dies;
they are under intense selective pressure. Other sites, such as third codon
positions, are only mildly affected by selection and tend to drift. There are even
sequences, such as viral coat proteins, in which selection acts to promote variation,
and these change very rapidly. Regardless of the underlying mechanism, it is possible
to use the rate of change as a molecular clock.
If you know the mutation rate for a particular sequence, you can use it to determine
how long ago two sequences diverged. Suppose you have the same protein sequence
from both cats and dogs, and there are 10 differences between them. From the fossil
record, you estimate that cats and dogs had a common ancestor 50 million years ago.
Now when you compare the cat sequence to the same sequence in humans, you find
12 differences. If 10 differences correspond to 50 million years, each difference
represents about 5 million years, so you can estimate that carnivores and humans
shared a common ancestor 60 million years ago. This is a very simple model that
treats all positions identically, and the numbers are not real data, but it is the
general idea behind molecular clocks.
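The arithmetic in this example can be sketched in a few lines of Python. This is
only an illustration of the simple linear model described above, not a real
phylogenetic method. The function names clock_rate and divergence_time are
hypothetical, and the numbers are the made-up values from the text.

```python
def clock_rate(differences, million_years):
    """Differences observed per million years, taken from a calibration pair."""
    if million_years <= 0:
        raise ValueError("calibration time must be positive")
    return differences / million_years


def divergence_time(differences, rate):
    """Estimated time, in millions of years, since two sequences diverged."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return differences / rate


# Calibrate the clock with the cat/dog pair: 10 differences and a
# fossil-based estimate of 50 million years (illustrative values only).
rate = clock_rate(10, 50)

# Apply the calibrated clock to the cat/human pair: 12 differences.
cat_human = divergence_time(12, rate)
print(f"Estimated divergence: {cat_human:.0f} million years ago")
```

Because the model is linear, every additional difference shifts the estimate by
the same fixed amount, which is why the choice of sequence matters so much.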
The key to using molecular clocks is that the sequences must “tick” at the appropriate
rate. The hypothetical protein in the last example is a poor choice for determining
how long ago humans and chimps last shared a common ancestor: at one difference
per 5 million years, a difference here or there would lead to a large difference in
the estimated time. Sequences that tick too fast are also not appropriate because
they are prone to saturation, in which repeated changes at the same position
overwrite earlier ones and the observed differences underestimate the true amount
of change.
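To make saturation concrete: once a position has changed more than once, later
changes overwrite earlier ones, so the differences you can observe stop growing
even though the true amount of change keeps increasing. The sketch below assumes
the Jukes-Cantor model for DNA, which the text does not mention and is used here
only as a convenient illustration. The function name expected_observed_fraction
is hypothetical.

```python
import math


def expected_observed_fraction(substitutions_per_site):
    """Expected fraction of differing sites between two DNA sequences.

    Uses the Jukes-Cantor model (an assumption, not taken from the text),
    in which all four bases are equally likely to replace one another.
    """
    return 0.75 * (1.0 - math.exp(-4.0 * substitutions_per_site / 3.0))


# Compare the true amount of change with what would actually be observed.
for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    p = expected_observed_fraction(d)
    print(f"true change {d:4.1f} per site -> observed difference {p:.3f}")
```

Under this model the observed fraction levels off just below 0.75, so past that
point very different divergence times produce nearly the same number of
differences and the clock can no longer tell them apart.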
Homology, Phylogeny, and Trees
When looking at the biological world around you, you see only what exists today.
You can’t get a clear picture of what the world looked like 100 million years ago.
However, you can see relationships between organisms and make inferences. For
example, you don’t know what the last common ancestor of humans, chimpanzees,
and gorillas looked like, but you can guess that it looked more like an ape than a
bird. The same is true at the sequence level: proteins from humans and chimps
are much more similar to each other than either is to the corresponding bird
protein. The study of evolutionary relationships between organisms is called
phylogenetics.
By definition, two sequences are homologous if they share a common ancestor. Two
sequences are either homologous or they aren’t. However, people often misuse the
term and say something like “these two sequences are 80 percent homologous.”
What they usually mean is that two sequences are 80 percent identical and not that
there is an 80 percent chance that they have a common ancestor. Determining if two