How is a protein sequence classified by an HMM?
4.4 years ago

I'm using Pfam HMMs to identify domains in my T. cruzi proteins with the HMMER program.

I kind of understand that an HMM is built from a family's multiple sequence alignment by estimating the probabilities of its states.

But how does hmmscan classify a new protein against the family HMMs? Where can I read more about this? Academic references would be appreciated.

alignment sequencing HMMER hmmscan pfam • 961 views

Have you tried Googling? The Durbin book should have a chapter on HMMs, and HMM-based classification is a relatively common topic. You could start with the reference section of the HMMER paper and work backwards until you get to an explanation of how and why it's done.

4.4 years ago
Mensur Dlakic ★ 27k

In this context, HMMs are numerical representations of sequence alignments that are more information-rich than the alignments themselves, because both residue substitution patterns and gap penalties are treated in a probabilistic fashion. Diverse sequences are given higher weight when the HMM is built. The models also mix in background residue probabilities, in a way that is inversely related to the alignment depth: the fewer sequences in the alignment, the more the background contributes.
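To make the idea concrete, here is a minimal Python sketch of scoring a query against two toy family models and picking the best one. It is a simplified stand-in, not what HMMER does internally: it uses match states only (no insert or delete states, so no gaps), no sequence weighting, and a plain sum of per-column log-odds rather than the Forward/Viterbi algorithms that HMMER runs over a full profile HMM. The uniform background, the `pseudo_weight` parameter, the zero-bit threshold, and the toy alignments are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform background frequencies (real tools use empirical ones).
BACKGROUND = {aa: 1.0 / len(ALPHABET) for aa in ALPHABET}


def build_profile(alignment, pseudo_weight=1.0):
    """Estimate per-column residue probabilities from an ungapped toy alignment."""
    depth = len(alignment)
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        # Background is mixed in with a fixed weight, so its share of each
        # probability shrinks as the alignment gets deeper.
        probs = {
            aa: (counts[aa] + pseudo_weight * BACKGROUND[aa]) / (depth + pseudo_weight)
            for aa in ALPHABET
        }
        profile.append(probs)
    return profile


def log_odds_score(profile, seq):
    """Sum over columns of log2(P(residue | model) / P(residue | background)), in bits."""
    return sum(
        math.log2(col[aa] / BACKGROUND[aa]) for col, aa in zip(profile, seq)
    )


def classify(query, families, threshold=0.0):
    """Score the query against every family; report the best one above the threshold."""
    scores = {name: log_odds_score(p, query) for name, p in families.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] > threshold else None), scores


# Hypothetical toy families, each built from a three-sequence alignment.
families = {
    "famA": build_profile(["ACDE", "ACDE", "ACDQ"]),
    "famB": build_profile(["WYKK", "WYKR", "WFKK"]),
}
label, scores = classify("ACDE", families)
print(label, scores)
```

A positive score means the query is more likely under the family model than under the background; a query that scores below the threshold for every family is left unassigned. hmmscan follows the same logic in spirit, but it sums over possible alignments of the query to each model and reports bit scores with E-values.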

If you don't have access to the Krogh, Durbin & Eddy book, the PDF version is available here. Any of the early Krogh, Eddy and Karplus papers would be good starting points.

https://www.ncbi.nlm.nih.gov/pubmed/8744772

https://www.ncbi.nlm.nih.gov/pubmed/9918945

https://www.ncbi.nlm.nih.gov/pubmed/9927713

https://www.ncbi.nlm.nih.gov/pubmed/18075166
