Non-mathematical explanation of what an "HMM" does (e.g. in GenScan)
5.5 years ago

What is the (non-mathematical) interpretation of this figure of how GenScan works?

Sorry, I am more of a biologist than a statistician (or whatever the exact scientific field HMMs belong to!). Can it be interpreted as:
1) a sequence starts out as N, and as bases are added the sequence is "checked" as either remaining in N or "going" to P (when, according to base content, the probability of being in N drops below a threshold), and so on?
2) or every possible placement of the sequence on the Markov chain (every assignment of bases to states) is somehow checked and the most probable results are returned?
3) or am I way off, and better to leave the whole thing alone?

HMM GenScan Explanation • 2.0k views

I would explain it as follows: as you can see in the image you attached, it is something like a flow diagram (https://www.sparxsystems.com.au/enterprise_architect_user_guide/14.0/guidebooks/tools_ba_data_flow_diagram.html). The difference is that it was built automatically from a set of sequences, and it captures the essential characteristics of that set.

Then a new sequence is run through it, and the HMM assigns a probability that the input sequence belongs to the same group of sequences the HMM was built on, based on the presence of those characteristics that define the original set of sequences.
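To make "the HMM assigns a probability to a new sequence" a little more concrete, here is a minimal sketch of the forward calculation on a toy two-state model. The state names, the numbers and the function name `forward_probability` are all made up for illustration; GenScan's real model has many more states and is considerably more elaborate.

```python
# Toy two-state HMM. Every number below is invented for illustration only.
states = ["N", "E"]  # hypothetical labels: N = intergenic-like, E = exon-like
start = {"N": 0.9, "E": 0.1}                      # probability of starting in each state
trans = {"N": {"N": 0.9, "E": 0.1},               # probability of moving between states
         "E": {"N": 0.2, "E": 0.8}}
emit = {                                          # probability of each base in each state
    "N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
    "E": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
}


def forward_probability(seq):
    """Probability of seq under the toy model, summed over all state paths."""
    # alpha[s] = probability of the bases seen so far AND being in state s now
    alpha = {s: start[s] * emit[s][seq[0]] for s in states}
    for base in seq[1:]:
        alpha = {
            s: emit[s][base] * sum(alpha[p] * trans[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())


print(forward_probability("ACGT"))
```

A sequence that resembles the training set gets a higher value than one that does not. In practice such values are usually compared against a background model rather than read on their own.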


Think of an HMM as the ‘most likely’ variations of a given sequence. Instead of worrying about whether the first base is an A, for example, it is represented as a set of probabilities of being an A, T, C or G, and so on for the rest of the sequence.
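As a rough illustration of "a set of probabilities per position", the sketch below scores a sequence against a tiny table of per-position base probabilities. The table `profile` and the function `profile_probability` are hypothetical, and the numbers are invented; this ignores the state-to-state transitions that a full HMM also has.

```python
# Hypothetical per-position base probabilities for a 3-base motif.
# Each row sums to 1; the values are invented for illustration.
profile = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},  # position 1: mostly A
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},  # position 2: mostly T
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},  # position 3: mostly G
]


def profile_probability(seq):
    """Multiply the probability of each observed base at its position."""
    prob = 1.0
    for column, base in zip(profile, seq):
        prob *= column[base]
    return prob


print(profile_probability("ATG"))  # matches the preferred base at every position
print(profile_probability("CCC"))  # matches none of them, so scores much lower
```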


Thank you. How is it then determined, based on the probabilities you mentioned, whether a sequence belongs to the 'gene' class or the intergenic class?

5.5 years ago

It is an HMM interpretation of what a gene model should look like, where each 'state' represents a certain feature of a gene model (e.g. exonic, intronic, UTR, ...).

If you follow the paths in this graph, you capture all possible gene structures. For example, from an exon you can only go to an intron. The edges (the arrows) denote the probability of going from one state to another.
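One common way to pick the single most probable path through such a graph is the Viterbi algorithm, which is close in spirit to option 2 in the question: it effectively considers every assignment of bases to states without enumerating them one by one. Below is a minimal sketch on a toy two-state model. The state labels, all probabilities and the function name `viterbi` are assumptions for illustration and are not taken from GenScan.

```python
import math

# Toy two-state model. Every number below is invented for illustration only.
states = ["N", "E"]  # hypothetical labels: N = intergenic-like, E = exon-like
start = {"N": 0.9, "E": 0.1}
trans = {"N": {"N": 0.9, "E": 0.1},   # the "arrows": probability of each move
         "E": {"N": 0.1, "E": 0.9}}
emit = {
    "N": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},  # N is AT-rich in this toy
    "E": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},  # E is GC-rich in this toy
}


def viterbi(seq):
    """Return the most probable state label for every base, as a string."""
    # score[s] = log-probability of the best path that ends in state s
    score = {s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}
    back = []  # one dict per step: best previous state for each current state
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + math.log(trans[p][s]))
            pointers[s] = best_prev
            new_score[s] = (score[best_prev] + math.log(trans[best_prev][s])
                            + math.log(emit[s][base]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state to recover the whole path
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


print(viterbi("AAAAGGGGGGAAAA"))
```

Note that no threshold is applied base by base: a state switch is chosen only when the whole path, including the cost of the two transitions, becomes more probable with the switch than without it.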

Here is a nice PDF with some more info.


Sorry, I am more of a biologist than a statistician (or whatever the exact scientific field HMMs belong to!). Can it be interpreted as:
1) a sequence starts out as N, and as bases are added the sequence is "checked" as either remaining in N or "going" to P (when the probability of being there drops below a threshold), and so on?
2) or every possible placement of the sequence on the Markov chain (every assignment of bases to states) is somehow checked and the most probable results are returned?
3) or am I way off, and better to leave the whole thing alone?


This video is for protein alignments, but perhaps you will find its explanation of HMMs useful. Note: as primarily a biologist, I find this arcane as well, so you are not alone.


Thank you, @genomax.
