Question: How to parse HMMSCAN output to enable comparison of domain architecture for several proteins
4.0 years ago, peter pfand wrote:

Dear community,

I have a FASTA file with several (co-)orthologous proteins from two different species. I want to (1) determine the domain architecture of each protein, (2) obtain, for each protein, the ordered sequence of domains that are likely to be true hits, and (3) compare those domains across proteins: their presence/absence and their order of appearance.

Steps 1 & 2: I can do this manually for a small set of proteins in a single FASTA file, but it becomes too tedious when I have 1000 FASTA files. Does anyone know of a parser or tool that retrieves the significant domains for every protein from hmmscan output?
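To make steps 1 and 2 concrete, here is a minimal sketch of the kind of parser being asked about. It assumes hmmscan was run with `--domtblout`, so that each non-comment line is one domain hit with whitespace-separated columns (target/domain name first, query/protein name fourth, independent E-value thirteenth, envelope coordinates in columns 20 and 21). The function name `parse_domtblout`, the `1e-3` cutoff, and the sample lines are all made up for illustration; the cutoff in particular is an assumption, not a recommended threshold, and overlapping hits are not resolved.

```python
from collections import defaultdict

def parse_domtblout(lines, max_ievalue=1e-3):
    """Map each protein to its domains, ordered by envelope start.

    `lines` are lines of an hmmscan --domtblout file. The cutoff
    `max_ievalue` is an illustrative assumption, not a recommendation.
    """
    hits = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # 22 fixed columns, then a free-text description that may contain spaces
        f = line.split(maxsplit=22)
        domain, protein = f[0], f[3]
        i_evalue = float(f[12])               # independent E-value of this domain
        env_from, env_to = int(f[19]), int(f[20])
        if i_evalue <= max_ievalue:
            hits[protein].append((env_from, env_to, domain))
    # sort by position on the protein and keep only the domain names
    return {p: [d for _, _, d in sorted(h)] for p, h in hits.items()}

# Invented sample lines in domtblout layout, for illustration only.
sample = """\
# target name  accession  tlen  query name ...
Pkinase PF00069.28 264 protA - 500 1e-50 170.0 0.1 1 1 1e-53 1e-50 169.5 0.1 1 264 120 380 118 382 0.95 Protein kinase domain
SH2 PF00017.27 77 protA - 500 1e-20 70.0 0.0 1 1 1e-23 1e-20 69.5 0.0 1 77 10 85 8 88 0.97 SH2 domain
SH3_1 PF00018.31 48 protA - 500 0.5 5.0 0.0 1 1 0.1 0.5 4.0 0.0 1 48 400 440 398 445 0.60 SH3 domain
"""

architectures = parse_domtblout(sample.splitlines())
print(architectures)  # {'protA': ['SH2', 'Pkinase']}
```

For 1000 files, the same function could be applied in a loop over the per-file `--domtblout` outputs and the resulting dictionaries merged.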

Step 3: I have found metrics such as WDAC (Weighted Domain Architecture Comparison), ADASS (alignment-free domain architecture similarity search) and DA-score (Domain Architecture similarity score), but I could not find any benchmark or comparison of these three, or of any others. Does anyone know which of the three is the most accurate, or whether there are better alternatives?
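To illustrate what step 3 is meant to capture, here is a naive baseline sketch. It is not an implementation of WDAC, ADASS, or DA-score; it simply scores presence/absence with a Jaccard index over domain sets and scores order with the similarity ratio from Python's `difflib.SequenceMatcher` over the ordered domain lists. The function names and the example architectures are hypothetical.

```python
from difflib import SequenceMatcher

def presence_similarity(arch_a, arch_b):
    """Jaccard index over the sets of domains (ignores order and copy number)."""
    a, b = set(arch_a), set(arch_b)
    if not a and not b:
        return 1.0  # two empty architectures are treated as identical
    return len(a & b) / len(a | b)

def order_similarity(arch_a, arch_b):
    """Order-aware similarity in [0, 1] over the ordered domain lists."""
    return SequenceMatcher(None, arch_a, arch_b, autojunk=False).ratio()

# Hypothetical architectures, N- to C-terminal.
prot_a = ["SH2", "Pkinase"]
prot_b = ["SH2", "SH3_1", "Pkinase"]

print(presence_similarity(prot_a, prot_b))  # 2 shared / 3 total domains
print(order_similarity(prot_a, prot_b))     # 2 * 2 matches / 5 elements
```

Keeping the two scores separate makes it possible to tell apart proteins that share the same domains in a different order from proteins that differ in domain content.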

I am quite new to this and feel a bit lost.

Thanks a lot in advance
