Splitting sequence files into operational taxonomic units (OTUs) can be done using the following steps:
- Load the data and compute the OTUs:
library(kmer)
library(magrittr)
library(ape)
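# Read the gzipped FASTQ file into a DNAbin object
seqs <- ape::read.fastq(
  file.path(getwd(), "datasets", "ch5", "fq", "SRR9040914ab.fq.gz")
)
# Cluster the reads into OTUs using 6-mers and an identity cutoff of 0.99
otu_vec <- otu(seqs, k = 6, threshold = 0.99)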
- Count the frequency of each OTU cluster:
# One row per sequence, then count the sequences assigned to each cluster
data.frame(
  seqid = names(otu_vec),
  cluster = otu_vec,
  row.names = NULL
) %>%
  dplyr::group_by(cluster) %>%
  dplyr::summarize(count = dplyr::n())
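
The counting step can also be sketched in base R without the FASTQ file. This is a minimal illustration, not part of the recipe: the vector below is hypothetical toy data shaped like the named integer vector that otu() returns, and the names seq1 to seq5 and cluster_counts are assumptions made for the example.

```r
# Hypothetical cluster assignments: one named integer per sequence,
# mimicking the shape of the vector returned by kmer::otu()
otu_vec <- c(seq1 = 1L, seq2 = 2L, seq3 = 1L, seq4 = 3L, seq5 = 1L)

# table() tallies how many sequences fall in each cluster;
# as.data.frame() turns the tally into a cluster/count table
cluster_counts <- as.data.frame(table(cluster = otu_vec), responseName = "count")

# Sort so the largest clusters come first
cluster_counts <- cluster_counts[order(-cluster_counts$count), ]
print(cluster_counts)
```

Here cluster 1 holds three of the five toy sequences and clusters 2 and 3 hold one each. The dplyr pipeline above gives the same tally, with the difference that table() stores the cluster labels as a factor rather than as integers.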