CS Mukhopadhyay and RK Choudhary
School of Animal Biotechnology, GADVASU, Ludhiana
The restriction enzyme (RE) sites present in a nucleotide sequence can be detected using a suitable in silico tool. Detecting RE sites is a routine step in wet‐lab experiments such as gene cloning, nucleotide sequencing, in vitro expression of a target protein, RFLP, AFLP, restriction mapping and restriction enzyme assays.
Several online tools for RE site detection are available. Some of the user‐friendly and accessible web‐based tools and their URLs (websites) have been tabulated below. Detailed procedures for determining RE sites using all these web tools will not be covered. This chapter will show how to use NEBCutter (New England Biolabs) as a tool for identifying RE sites.
To identify the RE site(s) present in a given nucleotide sequence.
The input can be a nucleotide sequence of any length, depending on the purpose: an insert for gene cloning or cDNA expression, an amplicon for detecting a mutation (RFLP), or a DNA sequence that is to be screened for the presence of RE sites (RE assay).
This is hosted by New England BioLabs (Vincze et al., 2003); the URL is http://tools.neb.com/NEBcutter2/index.
There are several options to enter the sequence (one at a time) under study:
Users can also use the following options to make the search for RE sites more stringent by clicking “More Options” (Figure 7.2).
NEBCutter, by default, screens for Type II REs (the cleaving location is within or adjacent to the recognition site; they act independently of a methylase and are magnesium‐dependent). Checking this box also enables NEBCutter to search for Type I REs (cleaving location remote from the recognition site; they exert both restriction and methylase activity and are ATP‐dependent) and Type III REs (cleaving location a short distance from the recognition site; they are complexed with a methylase and are ATP‐dependent).
Homing endonucleases catalyze the hydrolysis of genomic DNA within the cells that synthesize them. Check this box to include homing endonucleases.
These endonucleases cut only one strand of double‐stranded DNA at a specific recognition site. Checking this box includes these enzymes when screening the input sequence.
The checked methylase(s) will be ignored while screening for overlapping methylation sensitivity of the enzymes. Methylation‐sensitive restriction enzymes (MSREs) cannot cleave DNA when a cytosine in their recognition site is methylated and are therefore used to analyze methylated DNA and the methylation status of cytosine residues in CpG sequences.
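As a rough illustration of what "overlapping" a CpG means, the following Python sketch flags occurrences of a recognition sequence that contain or straddle a CpG dinucleotide. The function names and the demo sequence are assumptions. The sketch reports positional overlap only: whether a particular enzyme is actually blocked by methylation at that site is enzyme‐specific and must be checked against the enzyme's documentation.

```python
def cpg_positions(seq):
    """Return 0-based indices of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def sites_overlapping_cpg(seq, site):
    """Return 1-based starts of `site` occurrences that overlap a CpG."""
    seq = seq.upper()
    cpgs = cpg_positions(seq)
    flagged = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site)  # exclusive end of the recognition site
        # A CpG occupies indices c and c+1; it overlaps the site if either
        # of those two bases lies inside [start, end).
        if any(c < end and c + 1 >= start for c in cpgs):
            flagged.append(start + 1)
        start = seq.find(site, start + 1)
    return flagged


if __name__ == "__main__":
    demo = "AACCGGTTGAATTCGAAGAATTCTT"  # made-up sequence
    print("CCGG sites overlapping a CpG:", sites_overlapping_cpg(demo, "CCGG"))
    print("GAATTC sites overlapping a CpG:", sites_overlapping_cpg(demo, "GAATTC"))
```

In the demo, the first GAATTC site is flagged because its last base is the C of a CpG that extends one base beyond the site; the second GAATTC site is not.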
A drop‐down list of open reading frames (ORFs) is available. The user needs to select the ORF, depending on the input sequence.
Check this when a partial (i.e., an in‐between fragment of a larger string) nucleotide sequence (missing Start or Stop codon) is submitted.
Only the specified region is prepared for screening.
The user can set colors for different portions of the graphical output, such as the scale, the cut type (blunt, 5′ and 3′ extensions), the highlighted ORF and so on.
Applicable to coding sequences. Reduce the minimum ORF size if the coding sequence is short.
The user needs to provide a name for the sequence being analyzed, as an identifier.
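The effect of a minimum ORF size can be sketched with a simple forward‐strand ORF scan in Python. This is an illustrative assumption, not NEBCutter's ORF finder: the function name `find_orfs`, the `min_codons` parameter and its default value are hypothetical, only ATG start codons and the standard stop codons are considered, and partial ORFs lacking a start or stop codon are not reported.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=100):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    start is 1-based; end is the 1-based position of the last base of the
    stop codon; frame is 1, 2 or 3.
    """
    seq = "".join(seq.split()).upper()
    orfs = []
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon == "ATG":
                orf_start = i                      # first ATG opens the ORF
            elif orf_start is not None and codon in STOP_CODONS:
                n_codons = (i - orf_start) // 3    # coding codons, stop excluded
                if n_codons >= min_codons:
                    orfs.append((orf_start + 1, i + 3, frame + 1))
                orf_start = None                   # look for the next ORF
    return orfs


if __name__ == "__main__":
    demo = "CCATGAAATTTGGGTAACC"  # made-up sequence with one 4-codon ORF
    print(find_orfs(demo, min_codons=4))
    print(find_orfs(demo, min_codons=5))
```

Lowering `min_codons` lets short coding sequences be reported, which is the reason for reducing the minimum ORF size when the input is short.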
These include a check box for “Disable NEBcutter cookies” and a button to “Delete projects”.
To initiate screening of the input sequence for RE site(s).
The output page displays the whole input sequence as a single line (if the linear sequence option has been selected) with the RE sites highlighted (Figure 7.3). Each RE shown at its site is hyperlinked (appearing as blue‐colored text) to a page containing details of that RE. The color scheme is displayed at the top‐right part of the output page. Note that the orange‐colored hash (#) indicates susceptibility of the enzyme to methylation caused by common methylases of E. coli origin. The asterisk (*) symbol indicates susceptibility of the enzyme to CpG methylation.
The page also contains five small panes with hyperlinked words. These have been tabulated and explained in Table 7.1. The explanations have been adapted from the help provided by the NEBCutter tool (http://tools.neb.com/NEBcutter2/help/main_display.html), so some lines may be verbatim.
TABLE 7.1 Meaning of different terms used in NEBCutter (Vincze et al., 2003).
Term | Meaning |
Main Options | |
New DNA | This button, when clicked, opens the initial page |
Custom digest | To check digestion of the input DNA sequence using a chosen set of REs. These enzymes can be further categorized based on the type of restriction end (blunt, 3’‐overhang or 5’‐overhang) or the position of the restriction sites in the target sequence, e.g., RE sites within the input DNA sequence |
View sequence | To get the input sequence |
ORF summary | Tabulates following information about the genes that are displayed: coordinates of the genes; length of polypeptide; GenBank protein IDs of the respective gene sequences; single‐cutter REs |
Translate GB file | The ORF‐finder program of this tool predicts all the non‐overlapping, large open reading frames |
Save project | Saves the current project on the user’s local disk as a compressed file, which can be uploaded to the site again later. |
 | To produce a printable file of the current project in PDF, EPS or GIF format. |
Availability | |
All Commercial | Displays the REs commercially available from any supplier. The default is NEB‐produced REs. |
All | Displays all commercially available REs as well as REs that are cited in the literature but are not commercially available. |
Display | |
1, 2 or 3 cutters | The default is one cutter REs. The user can also specify displaying two or three cutters, separately. |
Alternative/Normal | “Alternative” will switch to alternative linear display for two and three cutter REs. Normal will display in the default fashion for all REs together on the scale. |
Zoom | |
Zoom in or Unzoom | Zoom in enlarges a selected region to a higher resolution (up to base level). “More” pops up a window to specify the coordinates to be displayed. |
List | |
0, 1 or 2 cutters | The page contains a list of REs as specified by the users (non‐cutter/single‐cutter/double‐cutter). The table contains the name of the enzyme and the RE site (specificity) which can be saved as a text file. The user can also modify the search on some cutters in this page. |
All sites | Lists all the RE sites according to their location along the input sequence |
Save all sites | To save the list of all sites on the computer, in *.txt format. The name of the file will be the same as the name of the sequence given by the user. |
Flanking enzymes | The user can identify the REs that cut in the regions flanking a target region on the input sequence, which is useful when a target region has to be excised. |
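The grouping of enzymes into non‐cutters, single‐cutters and double‐cutters listed in Table 7.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the enzyme dictionary, the function names and the demo sequence are hypothetical, and only exact, palindromic recognition sequences on a linear input are counted.

```python
def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of `site` in `seq`."""
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def group_by_cut_count(seq, enzymes):
    """Return {number of sites: [enzyme names]} for the given sequence."""
    seq = "".join(seq.split()).upper()
    groups = {}
    for name, site in enzymes.items():
        groups.setdefault(count_sites(seq, site), []).append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical enzyme set and made-up sequence, for illustration only.
    enzymes = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}
    demo = "ATGGAATTCCGGATCCTTTGAATTCAA"
    groups = group_by_cut_count(demo, enzymes)
    for n_cuts in sorted(groups):
        print(f"{n_cuts}-cutters:", ", ".join(groups[n_cuts]))
```

Enzymes in the 0 group are the non‐cutters, which are often of interest when an insert must remain intact during cloning.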
> XM_010797953|CSNA2
A T G C C A T T A A A T A C T A T A T A T A A A C A A C C A C A A A A T C A G A T C A T T A T C C A T T C A G C T C C T C C T T C A C T T C T T G T C C T C T A C T T T G G A A A A A A G G A A T T G A G A G C C A T G A A G G T C C T C A T C C T T G C C T G C C T G G T G G C T C T G G C C C T T G C A A G A G A G C T G G A A G A A C T C A A T G T A C C T G G T G A G A T T G T G G A A A G C C T T T C A A G C A G T G A G G A A T C T A T T A C A C G C A T C A A T A A G A A A A T T G A G A A G T T T C A G A G T G A G G A A C A G C A G C A A A C A G A G G A T G A A C T C C A G G A T A A A A T C C A C C C C T T T G C C C A G A C A C A G T C T C T A G T C T A T C C C T T C C C T G G G C C C A T C C A T A A C A G C C T C C C A C A A A A C A T C C C T C C T C T T A C T C A A A C C C C T G T G G T G G T G C C G C C T T T C C T T C A G C C T G A A G T A A T G G G A G T C T C C A A A G T G A A G G A G G C T A T G G C T C C T A A G C A C A A A G A A A T G C C C T T C C C T A A A T A T C C A G T T G A G C C C T T T A C T G A A A G G C A G A G C C T G A C T C T C A C T G A T G T T G A A A A T C T G C A C C T T C C T C T G C C T C T G C T C C A G T C T T G G A T G C A C C A G C C T C A C C A G C C T C T T C C T C C A A C T G T C A T G T T T C C T C C T C A G T C C G T G C T G T C C C T T T C T C A G T C C A A A G T C C T G C C T G T T C C C C A G A A A G C A G T G C C C T A T C C C C A G A G A G A T A T G C C C A T T C A G G C C T T T C T G C T G T A C C A G G A G C C T G T A C T C G G T C C T G T C C G G G G A C C C T T C C C T A T T A T T G T C T A A G A G G A T T T C A A A G T G A A T G C C C C C T C C T C A C T T T T G A A T T G A C T G C G A C T G G A A A T A T G G C A A C T T T T C A A T C C T T G C A T C A T G T T A C T A A G A T A A T T T T T A A A T G A G T A T A C A T G G A A C A A A A A A T G A A A C T T T A T T C C T T T A T T T A T T T T A T G C T T T T T C A T C T T A A T T T G A A T T T G A G T C A T A A A C T A T A T A T T T C A A A A T T T T A A T T C A A C A T T A G 
C A T A A A A G T T C A A T T T T A A C T T G G A A A T A T C A T G A A C A T A T C A A A A T A T G T A T A A A A A T A A T T T C T G G A A T T G T G A T T A T T A T T T C T T T A A G A A T C T A T T T C C T A A C C A G T C A T T T C A A T A A A T T A A T C C T T A G G C A T A
> XM_010806178|CSNA1
A T G C C A T T A A A T A C T A T A T A T A A A C A A C C A C A A A A T C A G A T C A T T A T C C A T T C A G C T C C T C C T T C A C T T C T T G T C C T C T A C T T T G G A A A A A A G G A A T T G A G A G C C A T G A A G G T C C T C A T C C T T G C C T G C C T G G T G G C T C T G G C C C T T G C A A G A G A G C T G G A A G A A C T C A A T G T A C C T G G T G A G A T T G T G G A A A G C C T T T C A A G C A G T G A G G A A T C T A T T A C A C G C A T C A A T A A G A A A A T T G A G A A G T T T C A G A G T G A G G A A C A G C A G C A A A C A G A G G A T G A A C T C C A G G A T A A A A T C C A C C C C T T T G C C C A G A C A C A G T C T C T A G T C T A T C C C T T C C C T G G G C C C A T C C C T A A C A G C C T C C C A C A A A A C A T C C C T C C T C T T A C T C A A A C C C C T G T G G T G G T G C C G C C T T T C C T T C A G C C T G A A G T A A T G G G A G T C T C C A A A G T G A A G G A G G C T A T G G C T C C T A A G C A C A A A G A A A T G C C C T T C C C T A A A T A T C C A G T T G A G C C C T T T A C T G A A A G C C A G A G C C T G A C T C T C A C T G A T G T T G A A A A T C T G C A C C T T C C T C T G C C T C T G C T C C A G T C T T G G A T G C A C C A G C C T C A C C A G C C T C T T C C T C C A A C T G T C A T G T T T C C T C C T C A G T C C G T G C T G T C C C T T T C T C A G T C C A A A G T C C T G C C T G T T C C C C A G A A A G C A G T G C C C T A T C C C C A G A G A G A T A T G C C C A T T C A G G C C T T T C T G C T G T A C C A G G A G C C T G T A C T C G G T C C T G T C C G G G G A C C C T T C C C T A T T A T T G T C T A A G A G G A T T T C A A A G T G A A T G C C C C C T C C T C A C T T T T G A A T T G A C T G C G A C T G G A A A T A T G G C A A C T T T T C A A T C C T T G C A T C A T G T T A C T A A G A T A A T T T T T A A A T G A G T A T A C A T G G A A C A A A A A A T G A A A C T T T A T T C C T T T A T T T A T T T T A T G C T T T T T C A T C T T A A T T T G A A T T T G A G T C A T A A A C T A T A T A T T T C A A A A T T T T A A T T C A A C A T T A G 
C A T A A A A G T T C A A T T T T A A C T T G G A A A T A T C A T G A A C A T A T C A A A A T A T G T A T A A A A A T A A T T T C T G G A A T T G T G A T T A T T A T T T C T T T A A G A A T C T A T T T C C T A A C C A G T C A T T T C A A T A A A T T A A T C C T T A G G C A T A
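The two FASTA records above are written with a space between bases and continue across line breaks, so a script that works on them would first need to normalise the text. The following Python sketch, with hypothetical function names and made‐up demo records rather than the sequences above, joins each record into a contiguous string and lists the positions at which two equal‐length sequences differ, which is a natural starting point when looking for sites that distinguish two alleles (as in RFLP).

```python
def parse_fasta(text):
    """Parse FASTA text into {header: sequence}, removing spaces and line breaks."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Bases may be separated by spaces; join them into one string.
            records[header] += "".join(line.split()).upper()
    return records


def differing_positions(seq_a, seq_b):
    """Return 1-based positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this simple comparison needs sequences of equal length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


if __name__ == "__main__":
    # Made-up records that mimic the spaced, multi-line layout.
    demo = "> seqA|demo\nA T G C\nG A A T T C\n> seqB|demo\nA T G C\nG A A T T A\n"
    records = parse_fasta(demo)
    print(records)
    print(differing_positions(records["seqA|demo"], records["seqB|demo"]))
```

Sequences of unequal length would need a proper alignment before comparison, which this sketch does not attempt.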