Finding protein domains with Pfam and bio3d

Discovering the function of a protein sequence is a key task. There are many ways to approach it, including whole-sequence similarity searches against databases of known proteins using tools such as BLAST. If we want more granular information, we can instead look for individual functional domains within a sequence. Databases such as Pfam and tools such as HMMER make this possible. Pfam encodes protein domains as profile hidden Markov models (HMMs), which HMMER uses to scan sequences and report likely occurrences of those domains. Genome annotation projects often carry out these searches for us, so finding the Pfam domains in our sequence becomes a matter of querying a database. Bioconductor does a great job of packaging up the data from these databases in dedicated annotation packages, usually suffixed with .db. In this recipe, we'll look at how to work out whether a package contains Pfam domain information, how to extract it for specific genes of interest, and an alternative method for running a Pfam search yourself if there isn't any pre-existing information.
