Building Hidden Markov Model (HMM) for proteins
5.9 years ago

Hi,

I have 460 amino acid sequences for a specific protein family X. I want to build an HMM from those sequences and use it to search for homologs of X (or neighbours of X) in some bacterial genomes. How can I build the HMM? Is there any software available? What are the steps?

Thanks in advance.

5.9 years ago
toheitka ▴ 230

The HMMER software (which is well documented) can be used to produce HMMs from alignments.

As for searching DNA with protein HMMs:

  • nhmmer searches nucleotide sequences with profiles built from nucleotide alignments, so it is the tool for querying DNA with DNA.
  • hmmbuild builds profile HMMs from amino acid alignments, and hmmsearch uses them to query protein sequences.

The current HMMER manual (2015, version 3.1) states: "Still missing: Translated comparisons. We’d of course love to have the HMM equivalents of BLASTX, TBLASTN, and TBLASTX. They’ll come."

(Of course, you could always translate your genome in all six reading frames, chop it up, and then screen it using protein HMMs. It is a bit ugly, and you might run into some frameshift trouble, but maybe it works?)
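
To make the six-frame idea concrete, here is a minimal Python sketch. It assumes the standard genetic code (the codon assignments are the same in the bacterial table) and marks any codon containing an ambiguous base as X. The function names and frame labels are made up for illustration and are not part of HMMER.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna, offset=0):
    """Translate from the given offset; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(offset, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to proteins."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq, offset)
    return frames
```

Each translated frame could then be written to a protein FASTA file and screened with hmmsearch. Stop codons come out as `*`, so you would probably want to split on them, or at least be aware of them, before searching.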


To give a bit more detail: to build an HMM for a set of proteins, the steps are:
- build a multiple sequence alignment with e.g. Clustal
- run hmmbuild (from the HMMER package) with the multiple sequence alignment as input
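
The second step, plus the follow-up search, can be sketched in Python roughly as below. This assumes the HMMER binaries are on your PATH and that the alignment already exists. The file names in the comments are hypothetical, and the E-value cutoff is just an example, not a recommendation.

```python
import subprocess


def hmmbuild_command(msa_path, hmm_path, name=None):
    """Argument list for: hmmbuild [options] <hmmfile_out> <msafile>."""
    cmd = ["hmmbuild"]
    if name is not None:
        cmd += ["-n", name]  # -n sets the name stored in the HMM file
    cmd += [hmm_path, msa_path]
    return cmd


def hmmsearch_command(hmm_path, seqdb_path, tblout_path, evalue=1e-5):
    """Argument list for: hmmsearch [options] <hmmfile> <seqdb>."""
    return [
        "hmmsearch",
        "--tblout", tblout_path,  # per-target hits as a parseable table
        "-E", str(evalue),        # report hits with E-value <= this cutoff
        hmm_path,
        seqdb_path,
    ]


def run(cmd):
    """Run a command and raise CalledProcessError if it fails."""
    subprocess.run(cmd, check=True)


# Hypothetical usage (file names are placeholders):
# run(hmmbuild_command("familyX.aln", "familyX.hmm", name="familyX"))
# run(hmmsearch_command("familyX.hmm", "proteome.faa", "hits.tbl"))
```

Note that hmmsearch expects protein sequences as the target, so for a genome you would search its predicted or translated proteins.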


Thanks. From the manual I gather that hmmbuild takes a .sto file as input and produces a .hmm file. Does Clustal produce a .sto file as output, or do I have to use other software to convert my alignment after running Clustal?

Cheers


As far as I remember, hmmbuild can read alignments in several formats. Just check the docs.
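
If you end up needing a Stockholm (.sto) file anyway, the conversion is simple enough to sketch by hand. The Python below is a minimal, unofficial converter: it assumes a plain Clustal file (a header line starting with CLUSTAL, then blocks of name/sequence lines) and writes only the bare minimum of Stockholm, with no annotation lines. The function name is made up for illustration.

```python
def clustal_to_stockholm(clustal_text):
    """Convert Clustal-formatted alignment text to minimal Stockholm text."""
    lines = clustal_text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("input does not look like a Clustal alignment")

    sequences = {}  # dicts keep insertion order, so sequence order is kept
    for line in lines[1:]:
        # Skip blank lines and conservation lines (which start with spaces).
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]  # ignore trailing residue counts
        sequences[name] = sequences.get(name, "") + chunk

    if len({len(seq) for seq in sequences.values()}) != 1:
        raise ValueError("aligned sequences are missing or differ in length")

    width = max(len(name) for name in sequences)
    out = ["# STOCKHOLM 1.0", ""]
    out += [f"{name.ljust(width)}  {seq}" for name, seq in sequences.items()]
    out.append("//")
    return "\n".join(out) + "\n"
```

An established converter is probably the safer choice for real data. The sketch is mainly meant to show how little separates the two formats.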
