Should I use DNA or protein sequences to build an HMM profile?
7.9 years ago
liuyifan2014 ▴ 110

Hello my friends,

I want to build an HMM profile to search against a DNA FASTA file. I am using hmmbuild to build the HMM and hmmsearch to run the search.

However, I am not sure whether I should use DNA or protein sequences to construct the HMM. Since the FASTA file is composed of DNA sequences, I am afraid that a protein HMM profile will not work on it. If I construct a DNA HMM profile instead, there are other problems, such as the orientation of the protein-coding genes and codon degeneracy.

Do you have any ideas? Thank you for any help!

genome hmm gene protein DNA • 4.5k views
7.9 years ago

If you want a model for proteins then build it using protein sequences. If you want to model some sort of ORF finding process that's dependent on the resulting protein sequence (enjoy dealing with splicing) then use the DNA sequence.

7.9 years ago
oli4morelle ▴ 20

HMMER does not support searching DNA with a protein query. This is possible with BLAST (tblastn).

You could translate your FASTA file in all six reading frames. Then you can search with a protein HMM. Nevertheless, if you get partial hits, I would check the corresponding DNA sequences for insertions, deletions, and introns.
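As a rough illustration of the six-frame translation step, here is a minimal Python sketch using only the standard library. It assumes the standard genetic code and plain A/C/G/T input (anything else becomes X). The function names (`six_frame_translations`, `protein_to_dna_coords`) and the frame labels (`+1` to `-3`) are made up for this example and are not part of HMMER. The second function maps a protein hit back to DNA coordinates so the underlying sequence can be inspected.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map hypothetical frame labels '+1'..'+3', '-1'..'-3' to protein strings."""
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


def protein_to_dna_coords(frame, aa_start, aa_end, seq_len):
    """Convert a 1-based protein hit range in a frame to 1-based,
    inclusive coordinates on the forward DNA strand."""
    offset = int(frame[1]) - 1
    start = offset + 3 * (aa_start - 1) + 1
    end = offset + 3 * aa_end
    if frame[0] == "-":
        # Coordinates were computed on the reverse complement; flip them.
        start, end = seq_len - end + 1, seq_len - start + 1
    return start, end


if __name__ == "__main__":
    dna = "ATGGCCTAA"
    for label, protein in six_frame_translations(dna).items():
        print(label, protein)
```

Writing each frame out as a separate protein FASTA record (with the frame label in the record name) gives a file that hmmsearch can take as its target. A hit at, say, residues 1 to 2 of frame `-2` can then be traced back with `protein_to_dna_coords("-2", 1, 2, len(dna))`.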

nhmmer is also an option. It may be less sensitive, but orientation is not a problem because the subject sequences are searched in both directions. The advantage is that insertions and deletions have less effect on the hit length.
