Finding DNA motifs with universalmotif can be done using the following steps:
- First, load the libraries and a motif of interest:
library(universalmotif)
library(Biostrings)
motif <- read_matrix(file.path(getwd(), "datasets", "ch3", "simple_motif.txt"))
- Then, load the sequences to scan with the motif:
sequences <- readDNAStringSet(file.path(getwd(), "datasets", "ch3", "promoters.fa"))
- Perform a scan of the sequences:
motif_hits <- scan_sequences(motif, sequences = sequences)
motif_hits
Note that motif_hits records where the motif matches in each of the target sequences, with one row per hit.
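To make the idea of a scan concrete, the following base R sketch scores every window of a toy sequence against a small log-odds matrix and keeps the windows above a threshold. It is only an illustration of the general technique, not universalmotif's implementation: the matrix values, the toy sequence, the uniform background, the 80% threshold and the helper score_windows() are all assumptions made up for this example.

```r
# Toy position probability matrix for a 4-bp motif (rows: A, C, G, T).
# Values are invented for illustration; the consensus is ATGA.
ppm <- matrix(
  c(0.7, 0.1, 0.1, 0.1,   # position 1: mostly A
    0.1, 0.1, 0.1, 0.7,   # position 2: mostly T
    0.1, 0.1, 0.7, 0.1,   # position 3: mostly G
    0.7, 0.1, 0.1, 0.1),  # position 4: mostly A
  nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL)
)

# Convert to log-odds scores, assuming a uniform background of 0.25 per base.
pwm <- log2(ppm / 0.25)

# Hypothetical helper: score every window of motif width along one sequence.
score_windows <- function(dna, pwm) {
  bases <- strsplit(dna, "")[[1]]
  w <- ncol(pwm)
  starts <- seq_len(length(bases) - w + 1)
  vapply(starts, function(s) {
    window <- bases[s:(s + w - 1)]
    # Look up one score per position: (row = base, column = position)
    sum(pwm[cbind(match(window, rownames(pwm)), seq_len(w))])
  }, numeric(1))
}

toy <- "CCATGACCGTATGATT"
scores <- score_windows(toy, pwm)

# Keep windows scoring at least 80% of the best possible score.
threshold <- 0.8 * sum(apply(pwm, 2, max))
hits <- which(scores >= threshold)

data.frame(start = hits, stop = hits + ncol(pwm) - 1, score = scores[hits])
```

Each row of the resulting data frame is one hit with its start, stop and score, which is the same kind of per-hit record that a scan reports. A real scan would normally also consider the reverse strand.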
- Calculate whether the motif is enriched in the sequences:
motif_info <- enrich_motifs(motif, sequences, shuffle.k = 3, verbose = 0, progress = FALSE, RC = TRUE)
motif_info
Note that motif_info reports whether the motif is statistically enriched in the set of sequences.
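The reasoning behind an enrichment test can be sketched in base R as a comparison of hit counts in the target sequences against hit counts in a background set, such as shuffled copies of the targets. The counts below are invented for illustration, and Fisher's exact test is used here as one common choice; it should not be read as a description of exactly what enrich_motifs computes.

```r
# Hypothetical counts, made up for this example:
# sequences with at least one motif hit versus sequences with none.
counts <- matrix(
  c(40, 160,    # target set: 40 with a hit, 160 without
    15, 185),   # shuffled background: 15 with a hit, 185 without
  nrow = 2,
  dimnames = list(c("hit", "no_hit"), c("target", "shuffled"))
)

# One-sided test: are hits over-represented in the target set?
res <- fisher.test(counts, alternative = "greater")

res$p.value    # small values suggest enrichment
res$estimate   # odds ratio; values above 1 favour the target set
```

Shuffling the target sequences while preserving short k-mer frequencies gives a background with similar composition, so a low p-value points to the motif itself rather than to base composition.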
- Run MEME to find novel motifs:
# Path to the MEME executable on this machine; change it to match your installation
meme_path <- "/Users/macleand/miniconda2/bin/meme"
meme_run <- run_meme(sequences, bin = meme_path, output = "meme_out", overwrite.dir = TRUE)
motifs <- read_meme("meme_out/meme.txt")
view_motifs(motifs)
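Because the location of the MEME executable differs between machines, it can help to look it up rather than hard-code it. The sketch below uses base R's Sys.which() and assumes the executable is named meme and is on the PATH; if it is not found, the path has to be set by hand.

```r
# Look for an executable called "meme" on the PATH.
# Sys.which() returns an empty string when nothing is found.
meme_path <- Sys.which("meme")

if (!nzchar(meme_path)) {
  message("MEME was not found on the PATH; set meme_path manually.")
} else {
  message("Using MEME at: ", meme_path)
}
```

When a path is found, unname(meme_path) can be supplied as the bin argument in the run_meme call.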