Performing a HMMER search against Pfam versus UniRef100
4.7 years ago
baverso • 0

Hello, I'm using HMMER3 to search a single immunoglobulin sequence against a database with jackhmmer, which iteratively updates the HMM profile and the seed family.

My question is about the search space: I could download the relevant family files from Pfam (e.g. V-set), but I already have UniRef100 on my machine.

Would there be any particular benefit to performing my search against V-set rather than UniRef100? As I understand it, all V-set sequences were drawn from UniProt in the first place, so V-set should be a subset of UniRef100. Is that correct?

More generally, when it comes to sequence searches, what is the benefit of using Pfam or TIGRFAMs rather than an entire sequence database (other than faster computation due to the reduced search space)?

I suppose I could run jackhmmer on V-set regardless, concatenate the alignments, and eliminate redundant sequences. Any suggestions? Thank you!
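The merge-and-deduplicate step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the hits from each search have already been exported as FASTA, and it treats two records as redundant only when their residues are identical (case-insensitive, gap characters stripped). It compares sequences rather than identifiers because the two sources may format identifiers differently. The function names and the toy records are hypothetical.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_unique(*handles):
    """Merge FASTA inputs, keeping the first record seen for each distinct sequence."""
    seen = set()
    merged = []
    for handle in handles:
        for header, seq in read_fasta(handle):
            # Normalise case and drop alignment gap characters before comparing.
            key = seq.upper().replace("-", "").replace(".", "")
            if key not in seen:
                seen.add(key)
                merged.append((header, key))
    return merged


# Toy stand-ins for the two hit sets (hypothetical records, not real data).
vset_hits = io.StringIO(">a/1-5\nQVQLV\n>b/1-5\nEVQLV\n")
uniref_hits = io.StringIO(">UniRef100_x\nqvqlv\n>UniRef100_y\nDIQMT\n")

unique = merge_unique(vset_hits, uniref_hits)
for header, seq in unique:
    print(f">{header}\n{seq}")
```

Exact-match deduplication will not collapse a fragment with the full-length sequence it came from, so hits that cover different regions of the same protein would still appear as separate records.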

hmmer pfam alignment • 1.6k views
4.7 years ago
Mensur Dlakic ★ 27k

It depends on what your exact goals are. V-set is a subset of UniRef100, but not necessarily the complete set of V-set proteins in the present version of UniRef100. The current Pfam version is almost a year old, so its sequences are at most representative of the UniRef100 dataset from a year ago. If you are interested in a fairly comprehensive set of homologs for your protein, searching the subset of Pfam sequences will do the trick. You'd likely go to a recent UniRef100 if you are interested in ALL available homologs.
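To compare the two options in practice, one could run the same query against both databases and look at what each returns. The sketch below only builds the two jackhmmer command lines. All file names are placeholders, and the iteration and CPU counts are arbitrary choices rather than recommendations. The flags used (`-N`, `--cpu`, `-o`, `-A`, `--tblout`) are standard jackhmmer options.

```python
import shlex


def jackhmmer_cmd(query_fasta, seq_db, prefix, iterations=5, cpu=4):
    """Build a jackhmmer argument list; every file name here is a placeholder."""
    return [
        "jackhmmer",
        "-N", str(iterations),       # maximum number of search iterations
        "--cpu", str(cpu),           # number of worker threads
        "-o", f"{prefix}.out",       # main human-readable output
        "-A", f"{prefix}.sto",       # alignment of hits from the final round
        "--tblout", f"{prefix}.tbl", # per-sequence hit table
        query_fasta,
        seq_db,
    ]


# Hypothetical file names: one query, two candidate search spaces.
commands = [
    jackhmmer_cmd("query_ig.fasta", "vset_full.fasta", "vs_vset"),
    jackhmmer_cmd("query_ig.fasta", "uniref100.fasta", "vs_uniref100"),
]

for cmd in commands:
    print(shlex.join(cmd))
    # To actually run a search: subprocess.run(cmd, check=True)
```

Keep in mind that E-values scale with the size of the database searched, so the same hit will generally get a different E-value in the two runs.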

At the risk of stating the obvious, also consider this: if your protein has more than one domain (something other than V-set), searching against the V-set sequences will not yield any matches for those additional domains.
