Assigning proteins to pre-computed gene families
4.3 years ago
liorglic ★ 1.4k

Hello,
I have a set of protein sequences which I'd like to assign to pre-computed gene families e.g. those existing in Ensembl or Phytozome. The output I'm looking for is a gene family ID per protein sequence, or in case of novel sequences, some indication that they do not belong to any known family.
Is there an easy way to achieve this, like a feature in one of these databases or an external tool?
If not, can you suggest ways to perform this analysis? I guess I could just BLAST my sequences against all proteins in the DB and simply assign them to the family of the best hit (after applying some cutoffs on the alignment quality), but is that good enough?
Any advice would be appreciated, thanks!
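
The best-hit idea described above could be sketched roughly as follows. This is only an illustration, not a feature of Ensembl or Phytozome: it assumes BLAST was run with the standard 12-column tabular output (`-outfmt 6`), and that you have built your own mapping from database protein IDs to family IDs. The function name `assign_families`, the `protein_to_family` mapping, and the cutoff values are all hypothetical and should be tuned to your data.

```python
def assign_families(blast_lines, protein_to_family, all_queries,
                    max_evalue=1e-5, min_identity=30.0):
    """Assign each query to the family of its best-scoring BLAST hit.

    blast_lines: iterable of lines in BLAST tabular format (-outfmt 6):
        qseqid sseqid pident length mismatch gapopen
        qstart qend sstart send evalue bitscore
    protein_to_family: dict mapping database protein ID -> family ID
        (hypothetical; you would build it from the database's family dump).
    Returns a dict query -> family ID, or None if no hit passed the cutoffs.
    """
    best = {}  # query -> (family, bitscore)
    for line in blast_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        query, subject = cols[0], cols[1]
        identity = float(cols[2])
        evalue = float(cols[10])
        bitscore = float(cols[11])
        # Apply the (assumed) alignment-quality cutoffs.
        if evalue > max_evalue or identity < min_identity:
            continue
        family = protein_to_family.get(subject)
        if family is None:
            continue  # hit has no family annotation
        if query not in best or bitscore > best[query][1]:
            best[query] = (family, bitscore)
    # Queries without an acceptable hit are reported as None (candidate novel).
    return {q: (best[q][0] if q in best else None) for q in all_queries}
```

A best hit alone can mislead for multi-domain proteins or very large families, so adding a coverage cutoff (alignment length relative to query length) is probably worth considering.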

gene family Ensembl Phytozome • 762 views
4.3 years ago
Mensur Dlakic ★ 27k

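Download the Pfam collection of hidden Markov models (what you need is this file, but make sure to gunzip it). Download and compile HMMER. There is a manual explaining all the commands, but most likely you will need hmmscan (it was called hmmpfam in the older HMMER2). With HMMER3 the model file has to be indexed with hmmpress before hmmscan can use it.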
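
As a rough sketch of what to do with the results: assuming HMMER3 and that the unzipped model file is named `Pfam-A.hmm` (an assumption, as are `proteins.fasta` and `hits.tbl`), you would run `hmmpress Pfam-A.hmm` and then `hmmscan --tblout hits.tbl Pfam-A.hmm proteins.fasta`. The hypothetical helper below reads the per-sequence table written by `--tblout` and keeps the best-scoring Pfam family for each protein; the E-value cutoff is an arbitrary example.

```python
def best_pfam_hits(tblout_lines, max_evalue=1e-5):
    """Pick the top-scoring Pfam family per query from hmmscan --tblout lines.

    In hmmscan's table, columns are whitespace-separated: target (model)
    name, target accession, query (sequence) name, query accession, then
    full-sequence E-value and score. Lines starting with '#' are comments.
    Returns a dict query -> (family name, accession, score).
    """
    best = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # The trailing description may contain spaces, so cap the split.
        fields = line.split(maxsplit=18)
        family, accession, query = fields[0], fields[1], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue > max_evalue:
            continue
        if query not in best or score > best[query][2]:
            best[query] = (family, accession, score)
    return best


def load_best_pfam_hits(path, max_evalue=1e-5):
    """Convenience wrapper: read a --tblout file from disk."""
    with open(path) as handle:
        return best_pfam_hits(handle, max_evalue)
```

Proteins missing from the returned dict had no hit under the cutoff and are candidates for being novel. Keep in mind that Pfam describes domain families, so a multi-domain protein can legitimately match several models, and keeping only the top hit discards that information.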
