CS Mukhopadhyay and RK Choudhary
School of Animal Biotechnology, GADVASU, Ludhiana
A two‐dimensional (2D) plot depicting one or more sequence features (sequence similarities, direct and/or inverted repeats, motifs, gaps, sequence inversions, etc.) is called a dot plot. A single sequence, or two different sequences (with the same type of residues), can be studied to reveal hidden sequence features. The dot plot has been used for local (not global) alignment, and was identified as a very powerful tool for molecular sequence analysis as early as the late 1960s (Fitch, 1969).
To compare two homologous molecular sequences using a dot plot.
Molecular sequences can be subjected to dot plot analysis using online tools like Dotlet, Dotter, and so on.
Two different sequences, or a single sequence, can be placed along the vertical and the horizontal axes of a matrix for analysis using a dot plot. The query and the subject sequences are placed along rows (Y‐axis) and columns (X‐axis), respectively. Next, a dot is placed in the cells, where the two axes have the same residue. Thereby, a subset of the sequence which has a run of identical residues will form a straight line (Figure 8.1).
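The matrix construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular tool; the function names `dot_matrix` and `render` and the toy sequence are assumptions made for the example.

```python
def dot_matrix(query, subject):
    """Build the dot matrix: rows follow the query (Y-axis),
    columns follow the subject (X-axis).
    Cell [i][j] is True when query[i] and subject[j] are the same residue."""
    return [[q == s for s in subject] for q in query]


def render(query, subject):
    """Return a text rendering of the dot matrix ('*' = dot, '.' = empty)."""
    matrix = dot_matrix(query, subject)
    lines = ["  " + " ".join(subject)]  # header row: subject residues
    for residue, row in zip(query, matrix):
        cells = " ".join("*" if hit else "." for hit in row)
        lines.append(residue + " " + cells)
    return "\n".join(lines)


# Toy example (hypothetical sequence): comparing a sequence with itself
# puts a dot in every cell of the main diagonal.
print(render("GATTACA", "GATTACA"))
```

Because every single matching residue produces a dot, this unfiltered plot also shows many scattered dots away from the diagonal.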
There are two main parameters optimized during a dot plot analysis: window size and mismatch limit.
The window size determines the run of residues that must match in both sequences. If the specified number of consecutive residues does not match, the graph will not show a dot. Window size thus controls the background noise: the smaller the window, the more background noise there will be. Conversely, a very large window will produce a clean plot that may be devoid of any indication of sequence similarity.
The mismatch limit allows one to tolerate a specified number of mismatches within the window, so that stretches of residues that are similar but not identical are still marked. The limit specified by different software ranges from 1 to 3.
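The effect of the two parameters can be illustrated with a small sketch. It assumes one common way of applying them (a dot is placed at the start of a window when the number of mismatches inside that window does not exceed the limit); individual tools may implement the filter differently. The function name `windowed_dots` and the sequences are hypothetical.

```python
def windowed_dots(query, subject, window=3, mismatch_limit=0):
    """Return the set of (i, j) cells that receive a dot.

    A dot is placed at (i, j) when the window of `window` residues
    starting at query[i] and subject[j] contains at most
    `mismatch_limit` mismatching positions."""
    dots = set()
    for i in range(len(query) - window + 1):
        for j in range(len(subject) - window + 1):
            mismatches = sum(
                1 for k in range(window) if query[i + k] != subject[j + k]
            )
            if mismatches <= mismatch_limit:
                dots.add((i, j))
    return dots


query = "ACGTACGT"    # hypothetical toy sequences
subject = "ACGTTCGT"  # differs from the query at one position

noisy = windowed_dots(query, subject, window=1)    # every single match
strict = windowed_dots(query, subject, window=4)   # exact 4-residue runs only
relaxed = windowed_dots(query, subject, window=4, mismatch_limit=1)

print(len(noisy), len(strict), len(relaxed))
```

A window of 1 reproduces the raw matrix with all of its background dots, a window of 4 keeps only exact runs, and allowing one mismatch restores the dots across the single substitution.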
Please note that these dot plot analyses have been done using http://www.vivo.colostate.edu/molkit/dnadot/ and https://wssp.rutgers.edu/StudentScholars/WSSP08/Dot plotter/Dot plotPractice.html?destination=StudentScholars/WSSP08/Dot plotter/Dot plotPractice.html, online tools which are no longer available.
Dot plot analysis reveals several sequence features at a glance. Some examples are given below:
The two sequences compared in a dot plot may differ due to insertion(s)/deletion(s) at one or more positions. These InDels appear as a break in the straight line (Figure 8.1). An insertion in the horizontal sequence (or a deletion in the vertical sequence) causes a horizontal shift and a break in the straight line, whereas an insertion in the vertical sequence (or a deletion in the horizontal sequence) causes a vertical shift and a discontinuity in the straight line. The third base (i.e., “C”) of the horizontal sequence and the ninth base (i.e., “G”) of the vertical sequence are the insertions (highlighted yellow in the second diagram, Figure 8.1) in those respective sequences.
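The shift caused by an InDel can be made explicit by recording which diagonal each matching window falls on. The sketch below is illustrative only: the sequences are hypothetical and are not those of Figure 8.1, and the helper name `diagonal_offsets` is an assumption. The diagonal is identified by the offset j − i, where i indexes the vertical sequence and j the horizontal one.

```python
from collections import defaultdict


def diagonal_offsets(vertical, horizontal, window=5):
    """Group exact window matches by diagonal.

    Returns {offset: [i, ...]} where offset = j - i and i is the start
    of the matching window in the vertical sequence."""
    offsets = defaultdict(list)
    for i in range(len(vertical) - window + 1):
        for j in range(len(horizontal) - window + 1):
            if vertical[i:i + window] == horizontal[j:j + window]:
                offsets[j - i].append(i)
    return dict(offsets)


vertical = "ACGTGCATTAGC"
# Same sequence with a single "G" inserted after the sixth base
horizontal = "ACGTGC" + "G" + "ATTAGC"

print(diagonal_offsets(vertical, horizontal))
# Matches before the insertion lie on diagonal 0; matches after it lie
# on diagonal 1, i.e. the line is displaced by one column.
```

An insertion in the vertical sequence would instead move the later matches to a negative offset, which corresponds to the vertical shift described above.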
The presence of repeat sequence(s) can be detected by a dot plot (Figure 8.2). The same sequence is placed along the horizontal and vertical axes. There is a fourfold repetition of the same sequence “TACGGCTACAGTCACG”, with the copies separated by short tetramers of different sequences:
T A C G G C T A C A G T C A C G G G G G T A C G G C T A C A G T C A C G C C C C T A C G G C T A C A G T C A C G A A A A T A C G G C T A C A G T C A C G A C C C C C T A T A A A A G C T C A G T G A G C G C C C G C G G T A A A T G T A C C T G T C A C C C T A C A G C G A C C T C T G C C A G A C C
In the dot plot result, we find one diagonal line representing the full sequence, and some short fragmented lines parallel to the diagonal. These short lines represent the repeat sequences. Every fragment stands for alignment of the repeats with each other.
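The positions of these parallel lines can be computed directly. The following sketch rebuilds only the repeat-containing part of the sequence from the repeat unit and the three tetramers (GGGG, CCCC, AAAA) that separate its copies, then reports the distance of each off-diagonal line from the main diagonal. The function name `repeat_offsets` is hypothetical, and using a window equal to the repeat length is an assumption made to keep the output clean.

```python
def repeat_offsets(sequence, window):
    """Compare a sequence with itself and return the sorted offsets (j - i)
    of all exact window matches above the main diagonal."""
    offsets = set()
    n = len(sequence)
    for i in range(n - window + 1):
        for j in range(i + 1, n - window + 1):  # j > i skips the main diagonal
            if sequence[i:i + window] == sequence[j:j + window]:
                offsets.add(j - i)
    return sorted(offsets)


unit = "TACGGCTACAGTCACG"
sequence = unit + "GGGG" + unit + "CCCC" + unit + "AAAA" + unit

print(repeat_offsets(sequence, window=len(unit)))
```

Each reported offset is a multiple of 20 (the 16-base repeat plus the 4-base spacer), which is why the short lines run parallel to the main diagonal at regular intervals.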
A nucleotide sequence may produce a stem‐loop secondary structure when it has a palindromic sequence interrupted by a short sequence. Similarly, there may be an inversion in the other half of a given sequence. Dot plot analysis can reveal such features. Inverted sequences will produce a main diagonal line between the other two corners (the corners adjacent to the end terminals of the sequences) of the matrix. Smaller diagonal lines lying symmetrically parallel to the main diagonal indicate that the same repeat occurs in tandem in the sequence.
>Seq1
M M N R V Q P E N V H S T I F T P R E Y Q V E L V D A C L K G N T L S V L A S R S T R T F L I T M V T R E M A H L V D A C L K G N T L S V L A S R S T R T L T R S K E Q G G K G Q L V D A C L K G N T L S V L A S R S T R T R T L L T G W S G P G L V R A G E A I Q Q N T N L A V T T Y T R L E Q V D G W L P S R W S H T F T E A Q V I I M T V D V L E K G L E T G L L Q L D M L N L L V I T D A H R V A T M M N R V Q P E N V H S T I F T P R E Y Q V E L V D A C L K G N T L S V L A S R S T R T F L I T M V T R E M A H L V D A C L K G N T L S V L A S R S T R T L T R S K E Q G G K G Q L V D A C L K G N T L S V L A S R S T R T R T L L T G W S G P G L V R A G E A I Q Q N T N L A V T T Y T R L E Q V D G W L P S R W S H T F T E A Q V I I M T V D V L E K G L E T G L L Q L D M L N L L V I T D A H R V A T
>Seq2
T A V R H A D T V N M D G T G K V D V T M V A T T H S W R S W G D V R T Y T T V A N T N A G A R V G G S W G T T R T R T S R S A V S T N G K C A D V G K G G K S R T T R T S R S A V S T N G K C A D V H A M R T V M T T R T S R S A V S T N G K C A D V V Y R T T S H V N V R N M M T A V R H A D T V N M D G T G K V D V T M V A T T H S W R S W G D V R T Y T T V A N T N A G A R V G G S W G T T R T R T S R S A V S T N G K C A D V G K G G K S R T T R T S R S A V S T N G K C A D V H A M R T V M T T R T S R S A V S T N G K C A D V V Y R T T S H V N V R N M M
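An inverted segment can be recognised computationally because its dots run along an anti-diagonal, where the sum of the row and column indices stays constant. The sketch below is a minimal illustration using a short hypothetical peptide rather than Seq1 and Seq2; the function name `inverted_diagonals` is an assumption. For nucleotide stem-loops one would compare against the reverse complement rather than the plain reverse.

```python
def inverted_diagonals(vertical, horizontal, window=4):
    """Return the set of anti-diagonals (row index + column index) that
    contain a window of the vertical sequence matching the horizontal
    sequence read in the opposite direction."""
    sums = set()
    for i in range(len(vertical) - window + 1):
        forward = vertical[i:i + window]
        for j in range(len(horizontal) - window + 1):
            # Reverse the horizontal window before comparing
            if forward == horizontal[j:j + window][::-1]:
                # Cells (i + k, j + window - 1 - k) all share this index sum
                sums.add(i + j + window - 1)
    return sums


peptide = "MNRVQPENVH"      # hypothetical toy peptide
inverted = peptide[::-1]    # the same residues in reverse order

print(inverted_diagonals(peptide, inverted))
```

All matches fall on a single anti-diagonal whose index sum equals the sequence length minus one, which is the line joining the two "other" corners of the matrix.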