Question

Non-mathematical explanation of what an "HMM" does (e.g. in GenScan)

1


6.7 years ago

catechize.2.learn ▴ 120

What is the (non-mathematical) interpretation of this figure of how GenScan works? [Figure: diagram of the GenScan model]

Sorry, I am more of a biologist than a statistician (or whatever the exact scientific field HMMs belong to!). Can the figure be interpreted in one of these ways?

1) A sequence starts out as N; as bases are added, the sequence is "checked" as either remaining in N or "going" to P (when, according to base content, the probability of being in N drops below a threshold), and so on.
2) Every possible placement of the sequence on the Markov chain (every assignment of bases to states) is somehow checked, and the most probable results are returned.
3) Or am I way off, and better off leaving the whole thing alone?

HMM GenScan Explanation • 2.7k views

ADD COMMENT • link 6.7 years ago by catechize.2.learn ▴ 120

1


I would explain it as follows. As you can see in the image you attached, it is something like a flow diagram (https://www.sparxsystems.com.au/enterprise_architect_user_guide/14.0/guidebooks/tools_ba_data_flow_diagram.html). The difference is that it was built automatically from a set of sequences, and it captures the essential characteristics of that set.

Then a new sequence is run through it, and the HMM assigns a probability that the input sequence belongs to the same group of sequences the HMM was built on, based on the presence of those characteristics that define the original set of sequences.
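To make "assigns a probability" a bit more concrete, here is a minimal sketch in Python. It is a toy two-state model whose numbers are written by hand for illustration (they are not learned from data and are not GenScan's parameters), and the names `START`, `TRANS`, `EMIT` and `forward_likelihood` are made up for this example. The function adds up the probability of the sequence over every possible route through the states, which is usually called the forward algorithm.

```python
# Toy 2-state HMM. All numbers are invented for illustration only.
STATES = ("intergenic", "coding")
START = {"intergenic": 0.9, "coding": 0.1}
TRANS = {  # probability of moving from one state to the next
    "intergenic": {"intergenic": 0.95, "coding": 0.05},
    "coding": {"intergenic": 0.10, "coding": 0.90},
}
EMIT = {  # probability of seeing each base while in a state
    "intergenic": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
    "coding": {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
}


def forward_likelihood(seq):
    """Return P(seq | model), summed over all state paths (seq must be non-empty)."""
    # prob[s] = probability of the bases read so far, with the last base in state s
    prob = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for base in seq[1:]:
        prob = {
            s: EMIT[s][base] * sum(prob[prev] * TRANS[prev][s] for prev in STATES)
            for s in STATES
        }
    return sum(prob.values())


if __name__ == "__main__":
    for seq in ("ATATATAT", "GCGCGCGC"):
        print(seq, forward_likelihood(seq))
```

For real sequences the products become very small, so practical implementations work with log probabilities or rescale at every step; the sketch skips that to stay readable.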

ADD REPLY • link 6.7 years ago by Raygozak ★ 1.4k

0


Think of an HMM as describing the ‘most likely’ variations of a given sequence. Instead of worrying about whether the first base is, for example, an A, each position is represented as a set of probabilities of being an A, T, C or G, and so on for the rest of the sequence.
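As a rough illustration of "a set of probabilities per position", the sketch below estimates base probabilities column by column from a few aligned example sequences and then scores a new sequence against them. This is only the per-position part of the idea: a full HMM also has states and transitions between them. The function names, the pseudocount, and the example sequences are assumptions made up for this sketch, and it expects equal-length sequences containing only A, C, G and T.

```python
from collections import Counter

BASES = "ACGT"


def position_probabilities(aligned_seqs, pseudocount=1.0):
    """Estimate P(base) for every column of equal-length example sequences."""
    length = len(aligned_seqs[0])
    total = len(aligned_seqs) + len(BASES) * pseudocount
    columns = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_seqs)
        # The pseudocount keeps bases never seen in a column from getting probability 0.
        columns.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return columns


def sequence_probability(seq, columns):
    """Multiply the per-position probabilities of the bases in seq."""
    p = 1.0
    for base, column in zip(seq, columns):
        p *= column[base]
    return p


if __name__ == "__main__":
    cols = position_probabilities(["ACGT", "ACGA", "ACGT"])
    print(sequence_probability("ACGT", cols))  # resembles the examples
    print(sequence_probability("TTTT", cols))  # does not
```

A sequence that resembles the examples gets a higher probability than one that does not, without any single position having to match exactly.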

ADD REPLY • link 6.7 years ago by Joe 22k

0


Thank you. How is it then determined, based on the probabilities you mentioned, whether a sequence belongs to the 'gene' class or the intergenic class?

ADD REPLY • link 6.7 years ago by catechize.2.learn ▴ 120

score 1 · Answer 1 · 2018-10-26

1


6.7 years ago

lieven.sterck 15k

It is an HMM interpretation of what a gene model should look like, where each 'state' represents a certain feature of a gene model (e.g. exonic, intronic, UTR, ...).

If you follow the paths in this graph, you capture all possible gene structures. For example, from an exon you can only go to an intron. The edges (the arrows) denote the probability of going from one state to another.
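The idea of states, arrows, and "following paths" can be sketched in Python as below. This is a deliberately tiny stand-in, not GenScan's model: it has only three states, every number is invented, and as a simplification of my own an exon may also return straight to intergenic (standing in for the end of a gene). A missing arrow simply means that move is not allowed. The `viterbi` function returns the single most probable assignment of bases to states, found by dynamic programming rather than by listing every path one by one.

```python
import math

# Allowed moves between states; an arrow that is absent has probability 0.
TRANS = {
    "intergenic": {"intergenic": 0.9, "exon": 0.1},
    "exon": {"exon": 0.8, "intron": 0.1, "intergenic": 0.1},
    "intron": {"intron": 0.9, "exon": 0.1},
}
START = {"intergenic": 1.0}  # assumption: every sequence starts outside a gene
EMIT = {  # invented base preferences per state
    "intergenic": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
    "exon": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
    "intron": {"A": 0.35, "C": 0.10, "G": 0.10, "T": 0.45},
}
STATES = list(TRANS)


def log(p):
    """Natural log that maps probability 0 to -infinity instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq):
    """Return the most probable state for each base of a non-empty sequence."""
    score = {s: log(START.get(s, 0.0)) + log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[t][s] = best previous state if base t+1 is in state s
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + log(TRANS[p].get(s, 0.0)))
            new_score[s] = score[prev] + log(TRANS[prev].get(s, 0.0)) + log(EMIT[s][base])
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Start from the best final state and walk the pointers backwards.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    seq = "AT" + "GC" * 10 + "AT"
    for base, state in zip(seq, viterbi(seq)):
        print(base, state)
```

Because forbidden moves score minus infinity, the returned path can only use arrows that exist in `TRANS`, which is how the graph restricts the output to valid gene structures.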

Here is a nice PDF with some more info.

ADD COMMENT • link 6.7 years ago by lieven.sterck 15k

0


Sorry, I am more of a biologist than a statistician (or whatever the exact scientific field HMMs belong to!). Can it be interpreted in one of these ways?

1) A sequence starts out as N; as bases are added, the sequence is "checked" as either remaining in N or "going" to P (when the probability of being in N drops below a threshold), and so on.
2) Every possible placement of the sequence on the Markov chain (every assignment of bases to states) is somehow checked, and the most probable results are returned.
3) Or am I way off, and better off leaving the whole thing alone?

ADD REPLY • link 6.7 years ago by catechize.2.learn ▴ 120

0


This video is about protein alignments, but perhaps you will find its explanation of HMMs useful. Note: as someone who is primarily a biologist, I find this arcane as well, so you are not alone.