Hello, I'm using HMMER3 to search a single immunoglobulin sequence against a database with jackhmmer, which iteratively updates the profile HMM and the seed alignment after each round.
My question is about the search space: I could download the relevant family files from Pfam (e.g. V-set), but I already have UniRef100 on my machine.
Would there be any particular benefit to running my search against V-set rather than UniRef100? I believe all V-set sequences were drawn from UniProt in the first place, so V-set should be a subset of UniRef100. Is that right?
More generally, when searching for homologous sequences, what is the benefit of using Pfam or TIGRFAMs rather than an entire sequence database (other than the faster run time that comes from a reduced search space)?
I suppose I could run jackhmmer on V-set regardless, concatenate the alignments, and eliminate redundant sequences. Any suggestions? Thank you!
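For reference, the "eliminate redundant sequences" step I have in mind would look roughly like the sketch below. It assumes the hits from both runs have already been exported to FASTA and concatenated into one text, and it treats two records as redundant only when their ungapped sequences are identical. The function names `read_fasta` and `dedupe` are my own, not part of HMMER.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def dedupe(records):
    """Keep the first record seen for each distinct ungapped sequence.

    Gap characters ('-' and '.') are stripped and case is ignored, so the
    same protein aligned differently in two runs counts as one sequence.
    """
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.replace("-", "").replace(".", "").upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique
```

This only removes exact duplicates. Near-identical sequences would need a clustering step, which I have not sketched here.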