Finding Genetic Variants with HTS Data

High-throughput sequencing (HTS) has made it possible to discover genetic variants and carry out genome-wide genotyping and haplotyping in many samples in a short space of time. The deluge of data that this technology has released has created unique opportunities for bioinformaticians and computer scientists, and has prompted innovative new approaches to data storage and analysis pipelines. The fundamental variant-calling pipeline starts with quality control of the HTS reads and alignment of those reads to a reference genome. These steps invariably take place before analysis in R and typically result in a BAM file of read alignments or a VCF file of variant positions (see the Appendix of this book for a brief discussion of these file formats) that we'll want to process in our R code.

Because variant calling and analysis are such fundamental techniques in bioinformatics, Bioconductor is well equipped with the tools we need to construct our software and perform our analysis. The key questions researchers will want to ask range from "Where are the genetic variants on my genome?" to "How many are there?" to "How can I classify them?" We'll look at some recipes that address these questions, along with other important general techniques that allow us to visualize variants and markers on a genome and assess associations of variants with phenotypes. We'll also look at other definitions of the term "genetic variant" and see how we can assess the copy number of individual loci.
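
As a small taste of what processing a VCF file in R can look like, here is a minimal sketch. It assumes the Bioconductor package VariantAnnotation is installed and uses the example file that ships with that package rather than any data from this book; the `hg19` genome label is likewise an assumption tied to that example file.

```r
# Assumes Bioconductor's VariantAnnotation package is already installed.
library(VariantAnnotation)

# Example VCF bundled with VariantAnnotation (illustrative only,
# not a file produced by the recipes in this chapter).
vcf_file <- system.file("extdata", "chr22.vcf.gz",
                        package = "VariantAnnotation")

# Read the variant records; "hg19" labels the reference genome build
# that the example calls were made against.
vcf <- readVcf(vcf_file, genome = "hg19")

# "Where are the variants?": genomic positions as a GRanges object.
head(rowRanges(vcf), 3)

# "How many are there?": number of variants (rows) and samples (columns).
dim(vcf)
```

The same object also carries per-variant annotations and per-sample genotypes, which is the kind of information that selecting and classifying variants depends on.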

In this chapter, we will cover the following recipes:

  • Finding SNPs and indels in sequence data using VariantTools
  • Predicting open reading frames in long reference sequences
  • Plotting features on genetic maps with karyoploteR
  • Finding alternative transcript isoforms
  • Selecting and classifying variants with VariantAnnotation
  • Extracting information in genomic regions of interest
  • Finding phenotype and genotype associations with GWAS
  • Estimating the copy number at a locus of interest