CHAPTER 41
In Silico Mining of Simple Sequence Repeats (SSR) Markers

Mir Asif Iquebal, Sarika and D Kumar

CABiN, ICAR‐IASRI, New Delhi, India

41.1 INTRODUCTION

Microsatellites are simple sequence repeats (SSRs) in which the repeat unit is a di‐, tri‐, tetra‐ or penta‐nucleotide. A common repeat motif in birds is (AC)n, where the dinucleotide AC is repeated n times (n ranges from 8 to 50). Microsatellites tend to occur in non‐coding regions of the DNA, but a few human genetic disorders are caused by microsatellites falling in coding regions. There are many tools for mining microsatellite markers.
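To make the (AC)n notation concrete, the following minimal Python sketch scans a DNA string for perfect tandem repeats using a regular expression. The function name find_ssrs and the minimum repeat counts per unit size are assumptions chosen for illustration; they are not taken from this chapter, and the sketch does not reproduce the full behaviour of any of the tools discussed here.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (motif, repeat_count, start, end) for perfect SSRs.

    Positions are 1-based and inclusive. The thresholds below are
    assumed values for illustration only.
    """
    if min_repeats is None:
        # unit size -> minimum number of repeats (assumption)
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # A unit of `size` bases followed by at least (min_rep - 1) copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated, e.g. "AA" or "ACAC"
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((motif, len(m.group(0)) // size, m.start() + 1, m.end()))
    return sorted(hits, key=lambda h: h[2])

# An (AC)8 repeat flanked by two bases on each side
print(find_ssrs("GG" + "AC" * 8 + "TT"))
```

Here the motif AC is reported with a repeat count of 8, corresponding to (AC)n with n = 8.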

41.2 OBJECTIVE

To learn how to mine simple sequence repeat (SSR) markers in a given DNA sequence.

A number of tools for mining microsatellite markers from genomes are available in the public domain. Examples include RepeatMasker (www.repeatmasker.org/; Smit et al., 1996), Sputnik (http://espressosoftware.com/pages/sputnik.jsp; Abajian, 1994), Tandem Repeats Finder (TRF) (http://tandem.bu.edu/trf/trf.html; Benson, 1999), MISA (http://pgrc.ipk-gatersleben.de/misa/; Thiel et al., 2003), SSRIT (Temnykh et al., 2001), and others.

41.3 MISA (MICROSATELLITE IDENTIFICATION TOOL)

MISA is available at http://pgrc.ipk-gatersleben.de/misa/misa.html (Figure 41.1). The requirements for installing MISA are:

  1. Windows XP operating system (minimum 512 MB RAM, Pentium IV processor), or Linux‐based system.
  2. A working Perl installation.

FIGURE 41.1 MISA homepage.

41.3.1 MISA installation

Copy the contents of the misa.pl script (Figure 41.2) into a text file and save it as misa.pl.

FIGURE 41.2 Download misa.pl.

Copy the contents of the misa.ini file (Figure 41.3) into a text file and save it as misa.ini. Once misa.pl and misa.ini are in place, microsatellites can be identified by running ./misa.pl FASTAfile (or perl misa.pl FASTAfile), where FASTAfile is the input file in FASTA format.

FIGURE 41.3 Download misa.ini.

41.3.2 Objective

To mine SSRs from a given sequence:

>sequence

A A T T C G G C A C C A G T A A A T T T T C C C A A A G G T T T C A A A A A T G A A A A T T T T G A T T T T C C T A A T A A T G T T T C T T G C T A T G T T G C T A G T A A C A A G T G G G A A T A A T A A T C T A G T A G A G A C A A C A T G C A A G A A C A C A C C A A A T T A T A A T T T G T G T G T G A A A A C T T T G T C T T T A G A C A A A A G A A G T G A A A A A G C A G G A G A T A T T A C A A C A T T A G C A T T A A T T A T G G T T G A T G C T A T T A A A T C T A A A G C T A A T C A A G C T G C T A A T A C T A T T T C A A A A C T T A G G C A T T C T A A T C C T C C T C A A G C T T G G A A A G A T C C T T T G A A G A A T T G T G C C T T T T C G T A T A A G G T A A T T T T A C C A G C A A G T A T G C C A G A A G C A T T A G A A G C A T T A A C A A A A G G T G A T C C A A A A T T T G C A G A A G A T G G A A T G G T T G G T T C T T C T G G T G A T G C A C A A G A A T G T G A A G A A T A T T T T A A A G C T A C A A C T A T T A A A T A T T C A C C A C T T T C T A A A T T A A A T A T A G A T G T T C A T G A A C T T T C T G A T G T T G G T A G A G C C A T T G T A A G A A A T T T A T T G T A A T A T G T C A T G T C A T A A T G T T A C A T A T C G A A A A G T T T T T A T A G T T T A G T T T G A T A G A C T G T C T G A A T T A T T A T T T T A T T C T T G C T A G T A A A A A T T C G A T T C G T C A C A T T A T G A T C A T C T G T G G T T C A T T T T T C T T T T T T C T A C C T C A A A T G T T A T G T G T G T A T C C C C T C T T A A T T A T T A T A A G A A A A A T A T A T C A T A A A T A T T T G T A C A A G T G T A A T A C T C T T A T C C A A T A T A T A T G T T K G Y C C C C T T C T A A A A A A A A A A A A A A A A A A A A A A A A A A A

41.3.3 Procedure

  1. Copy the above sequence, paste it into Notepad and save it as “testfile.fasta”.
  2. Open the command prompt and change to the directory where misa.pl and misa.ini have been saved.
  3. Place the input file in the same directory as misa.pl and misa.ini, or give the correct path to the input file in the command.
  4. Type the command: perl misa.pl testfile.fasta (Figure 41.4).
    FIGURE 41.4 The command prompt where code is written.
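The steps above can also be scripted. The sketch below is one possible way to do so in Python: it writes a sequence to a FASTA file, removing any spaces inside the sequence, and then calls Perl on misa.pl. The helper names (format_fasta, write_fasta, misa_command, run_misa) are hypothetical, and the sketch assumes that perl is on the PATH and that misa.pl and misa.ini sit in the working directory.

```python
import subprocess
from pathlib import Path

def format_fasta(header, raw_sequence, width=60):
    """Build one FASTA record, dropping whitespace inside the sequence."""
    seq = "".join(raw_sequence.split()).upper()
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"

def write_fasta(path, header, raw_sequence):
    """Save the record to disk, e.g. as testfile.fasta."""
    Path(path).write_text(format_fasta(header, raw_sequence))

def misa_command(fasta_path, script="misa.pl"):
    """Command from step 4: perl misa.pl testfile.fasta."""
    return ["perl", script, str(fasta_path)]

def run_misa(fasta_path, workdir="."):
    """Run MISA; assumes misa.pl and misa.ini are present in workdir."""
    return subprocess.run(misa_command(fasta_path), cwd=workdir, check=True)

# Example: show the command without executing it
print(" ".join(misa_command("testfile.fasta")))
```

Calling write_fasta("testfile.fasta", "sequence", pasted_text) followed by run_misa("testfile.fasta") would correspond to steps 1 to 4, where pasted_text stands for the sequence given in Section 41.3.2.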

41.4 RESULT

  1. Two files are generated, namely testfile.misa (Figure 41.5) and testfile.statistics (Figure 41.6).
    FIGURE 41.5 The output, as seen in testfile.misa.

    FIGURE 41.6 The output, as seen in testfile.statistics.

  2. For the given sequence, we get one SSR: a mononucleotide (“mono” type) repeat of 27 units. Its start and end positions are 802 and 828, respectively, which is consistent with a length of 27 bases (828 − 802 + 1 = 27).
  3. Primers for the marker obtained can then be designed using the Primer3 tool.
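The reported positions can be checked programmatically. The following sketch reads a tab-delimited MISA-style table and recomputes the span of each SSR from its start and end. The column names (ID, SSR nr., SSR type, SSR, size, start, end) are an assumption about the layout of the .misa file; compare them with Figure 41.5 before relying on them. The example line is illustrative: it uses the repeat count and positions reported above, while the ID, type code and motif notation are placeholders.

```python
import csv
import io

def parse_misa(handle):
    """Parse a tab-delimited MISA-style table into a list of dicts.

    Assumed columns: ID, SSR nr., SSR type, SSR, size, start, end.
    """
    records = []
    for row in csv.DictReader(handle, delimiter="\t"):
        start, end = int(row["start"]), int(row["end"])
        records.append({
            "id": row["ID"],
            "type": row["SSR type"],
            "ssr": row["SSR"],
            "size": int(row["size"]),
            "start": start,
            "end": end,
            # Inclusive coordinates, so the span is end - start + 1
            "span": end - start + 1,
        })
    return records

# Illustrative input built from the values reported in the Result section
EXAMPLE = (
    "ID\tSSR nr.\tSSR type\tSSR\tsize\tstart\tend\n"
    "sequence\t1\tp1\t(A)27\t27\t802\t828\n"
)

for rec in parse_misa(io.StringIO(EXAMPLE)):
    print(rec["id"], rec["ssr"], rec["start"], rec["end"], rec["span"])
```

For a real run, open the generated output file and pass the file handle to parse_misa instead of the in-memory example.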

41.5 QUESTIONS

  1. From NCBI, retrieve ESTs for “Sesame”. How many hits do you get? Copy the top 20 hits and perform SSR mining on them.
  2. How many SSRs do you get from the selected FASTA sequences?
  3. What is the number of SSR‐containing sequences?
  4. How many sequences contain more than one SSR?
  5. What are the repeat types of the mined SSRs? Discuss in detail.
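The counts asked for in questions 2 to 5 can be tallied with a few lines of Python once the SSRs have been mined. The sketch below assumes the results are available as (sequence ID, repeat type) pairs; the function name summarize_ssrs and the sample data are hypothetical and do not represent real sesame results.

```python
from collections import Counter

def summarize_ssrs(ssrs):
    """Summarize a list of (sequence_id, repeat_type) pairs."""
    per_sequence = Counter(seq_id for seq_id, _ in ssrs)
    per_type = Counter(ssr_type for _, ssr_type in ssrs)
    return {
        "total_ssrs": len(ssrs),                      # question 2
        "sequences_with_ssr": len(per_sequence),      # question 3
        "sequences_with_multiple": sum(               # question 4
            1 for n in per_sequence.values() if n > 1),
        "by_type": dict(per_type),                    # question 5
    }

# Made-up example data, for illustration only
sample = [("est1", "mono"), ("est1", "di"), ("est2", "tri")]
print(summarize_ssrs(sample))
```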