This is the Title of the Book, eMatter Edition
Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
32
|
Chapter 2: Biological Sequences
Molecular Clocks
If you compare the sequences from related organisms, it is clear that certain posi-
tions don’t change much over time while others change very rapidly. For example,
parts of the ribosomal RNA are identical in every organism sequenced to date, from
bacteria to humans. These subsequences are so important that if they change, the
organism dies. Clearly, these are under intense selective pressure. There are other
sites, such as third codon positions, that are only mildly affected by selection and
tend to drift. There are even sequences, such as viral coat proteins, in which selec-
tion acts to promote variation, and these change very rapidly. Regardless of the
underlying mechanism, it is possible to use the rate of change as a molecular clock.
If you know the mutation rate for a particular sequence, you can use it to determine
how long ago two sequences diverged. Suppose you have the same protein sequence
from both cats and dogs, and there are 10 differences between them. From the fossil
record, you estimate that cats and dogs had a common ancestor 50 million years ago.
Now when you compare the cat sequence to the same sequence in humans, you find
12 differences. You can now estimate that carnivores and humans shared a common
ancestor 60 million years ago. We’re using a very simple model here that treats all
positions identically and we’re not using real data, but this is the general idea behind
molecular clocks.
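The arithmetic in the cat, dog, and human example can be sketched in a few lines of Python. This is only an illustration of the simple model described above (every position treated identically, a constant rate); the function names are made up for this sketch, and the numbers are the invented ones from the text, not real data.

```python
def differences_per_million_years(differences, divergence_mya):
    """Calibrate the clock from a pair whose divergence time is known."""
    return differences / divergence_mya


def estimate_divergence_mya(differences, rate):
    """Estimate a divergence time (millions of years ago) from a difference count."""
    return differences / rate


# Calibration from the text: 10 cat-dog differences, and a fossil-based
# estimate of 50 million years since the common ancestor.
rate = differences_per_million_years(10, 50)

# Apply the calibrated clock to the 12 cat-human differences.
cat_human_mya = estimate_divergence_mya(12, rate)
print(f"Estimated cat-human divergence: {cat_human_mya:.0f} million years ago")

# How much the estimate moves for each additional difference.
years_per_difference = 1 / rate
print(f"Each difference shifts the estimate by {years_per_difference:.0f} million years")
```

With these numbers the clock ticks once every 5 million years, which is why 12 differences correspond to 60 million years.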
The key to using molecular clocks is that the sequences must “tick” at the appropriate
rate. The hypothetical protein in the last example is a poor choice for determining
how long ago humans and chimps last shared a common ancestor. It ticks so slowly
(about one difference per 5 million years in that example) that one difference here
or there would lead to a large difference in the estimated time. Sequences that tick
too fast are also not appropriate because they are prone to saturation: the same
position changes more than once, so the observed differences undercount the changes
that actually occurred.
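Saturation can be illustrated with a standard substitution model. The sketch below assumes the Jukes-Cantor model for DNA, which is not the model used in the example above and is brought in here only for illustration. Under that assumption, the expected fraction of sites that differ between two sequences levels off below 0.75 no matter how many substitutions have actually occurred. The function name is hypothetical.

```python
import math


def expected_observed_fraction(substitutions_per_site):
    """Expected fraction of differing sites under the Jukes-Cantor model (an assumption)."""
    return 0.75 * (1.0 - math.exp(-4.0 * substitutions_per_site / 3.0))


# As the true amount of change grows, the observed differences stop keeping up.
for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    observed = expected_observed_fraction(d)
    print(f"{d:>4} substitutions/site -> {observed:.3f} of sites observed to differ")
```

For small amounts of change the observed fraction is close to the true value, which is the range where a sequence is useful as a clock.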
Homology, Phylogeny, and Trees
When looking at the biological world around you, you see only what exists today.
You can’t get a clear picture of what the world looked like 100 million years ago.
However, you can see relationships between organisms and make inferences. For
example, you don’t know what the last common ancestor of humans, chimpanzees,
and gorillas looked like, but you can guess that it looked more like an ape than a
bird. This is also the case at the sequence level; proteins from humans and chimps
are much more similar to each other than either is to a bird. The study of relation-
ships between organisms is called phylogenetics.
By definition, two sequences are homologous if they share a common ancestor. Two
sequences are either homologous or they aren’t. However, people often misuse the
term and say something like “these two sequences are 80 percent homologous.”
What they usually mean is that two sequences are 80 percent identical and not that
there is an 80 percent chance that they have a common ancestor. Determining if two
sequences are indeed homologous requires making inferences. This isn’t always a
simple task; sometimes homology can be stated with near certainty, but not always.
Sequences may appear to be related from chance similarity (or convergent evolution).
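Percent identity, unlike homology, is a quantity you can compute directly from an alignment. The following is a minimal sketch, assuming two sequences that have already been aligned to the same length with "-" marking gaps. The function name and the two sequence fragments are invented for illustration, and counting only gap-free columns is one convention among several.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which the two residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Made-up aligned fragments that differ at 2 of 10 positions.
seq1 = "MKTAYIAKQR"
seq2 = "MKTAHIAKQL"
identity = percent_identity(seq1, seq2)
print(f"{identity:.0f} percent identical")
```

A result of 80 percent here says nothing by itself about ancestry; whether the two sequences are homologous is still an inference.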
Sequence homology is further refined by the terms orthologous and paralogous.
Sequences separated by speciation are called orthologs, while sequences separated by
duplication are called paralogs. The genes for myoglobin in humans and mice are
orthologs; they are the same gene in different species. If the myoglobin gene is dupli-
cated in humans, the two myoglobins will be paralogs of each other. It’s somewhat
confusing, but both human paralogs would be considered orthologous to the mouse
myoglobin. It is generally the case that the most similar genes between species are
orthologs, and this is often used as an operational definition.
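The operational definition can be sketched as a "most similar gene" lookup. This is an illustration only: the gene labels and percent identities below are invented, the helper name is hypothetical, and real ortholog assignment typically involves more than a single best hit.

```python
# Hypothetical percent identities between human and mouse genes (invented numbers).
identity = {
    ("human_myoglobin", "mouse_myoglobin"): 85.0,
    ("human_myoglobin", "mouse_hemoglobin_alpha"): 27.0,
    ("human_hemoglobin_alpha", "mouse_myoglobin"): 26.0,
    ("human_hemoglobin_alpha", "mouse_hemoglobin_alpha"): 86.0,
}


def most_similar(gene, scores):
    """Return the gene in the other species with the highest score against `gene`."""
    candidates = {other: s for (query, other), s in scores.items() if query == gene}
    return max(candidates, key=candidates.get)


# Treat each human gene's most similar mouse gene as its putative ortholog.
human_genes = sorted({query for query, _ in identity})
putative_orthologs = {gene: most_similar(gene, identity) for gene in human_genes}
for human_gene, mouse_gene in putative_orthologs.items():
    print(f"{human_gene} -> {mouse_gene}")
```

Because this rule picks only the single most similar gene, it would not by itself report both copies of a duplicated gene as orthologs of the same mouse gene.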
The Tree of Life
An introduction to molecular evolution would be incomplete without an overview of
life on Earth. You may have learned in an introductory biology class that there are
five taxonomic kingdoms (animals, plants, fungi, monera, and protista). This is
based largely on what can be seen with your eyes or a microscope. Molecular biol-
ogy opened up a new way to classify organisms based on sequences rather than
external features. Figure 2-4 shows a tree for various organisms based on ribosomal
DNA sequence. There are three obvious domains that Carl Woese called the Bacte-
ria, Archaea, and Eucarya. Note that the arrow in the figure points to the root of the
plants, animals, and fungi. From this perspective, the traditional five kingdoms are a
bit nearsighted.
In terms of genomes and overall cell structure, there are only two major divisions:
the prokaryotes (bacteria and archaea) and eukaryotes. Except in rare cases, prokary-
otes are microscopic organisms that are usually shaped like rods or spheres. Some of
the more famous prokaryotes include Escherichia coli (a bacterium that lives in your
gut and is a favorite model organism for microbiologists) and Yersinia pestis (the bac-
terium that causes bubonic plague). The major distinguishing feature of prokaryotes
is that DNA replication, transcription, and translation all take place in the same
compartment of the cell because there is only one compartment in the cell.
Eukaryotes come in many shapes and sizes, primarily because they can form multicellular
organisms such as birds and trees. But some eukaryotes are simple, single-celled
organisms such as Saccharomyces cerevisiae (the yeast used for making beer).
All eukaryotes have a nucleus (karyon is Greek for nut or kernel) in which DNA is stored,
in addition to other membranous organelles. Interestingly, most eukaryotes contain
mitochondria. These organelles have their own genome and are descended from bac-
teria that long ago entered a cooperative relationship with eukaryotes. This is also
true of chloroplasts, which are responsible for photosynthesis in plants. It is thought
that eukaryotes are a fusion of two bacteria, one a Eubacteria and one an Archaebac-
Figure 2-4. Tree of life based on rRNA sequence (Diagram courtesy of Norman Pace. Used with
permission.)
[Figure: tree whose branches fall into three labeled domains, BACTERIA, ARCHAEA, and EUCARYA, with the root marked. Scale bar: 0.1 changes / site.]