How to calculate the joint entropy H(X,Y,Z), where X, Y and Z are positions of an MSA
8.1 years ago
sivanc7 ▴ 10

Hi to all!

I want to compute the mutual information (MI) among three or more positions of a multiple sequence alignment (MSA). So far, I know that for two positions

MI(X,Y) = sum( p(X,Y) * log ( p(X,Y) / ( p(X) * p(Y) ) ) )
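
A minimal Python sketch of this pairwise formula, assuming each MSA position is given as a string with one symbol per sequence, that probabilities are estimated as plain relative frequencies, and that the logarithm is base 2 (the formula above does not fix the base). The function name `mutual_information` and the toy columns are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2

def mutual_information(col_x, col_y):
    """MI(X,Y) in bits between two alignment columns of equal length."""
    n = len(col_x)
    count_x = Counter(col_x)               # symbol counts at position X
    count_y = Counter(col_y)               # symbol counts at position Y
    count_xy = Counter(zip(col_x, col_y))  # counts of symbol pairs (one pair per sequence)
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_xy = c / n
        p_x = count_x[a] / n
        p_y = count_y[b] / n
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi

# Hypothetical toy columns: four sequences, perfectly covarying positions.
print(mutual_information("AAGG", "UUCC"))  # 1.0
```

Pairs that never occur in the alignment contribute nothing to the sum, so iterating over the observed pairs only is enough.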


But the MI among three positions is:

MI(X,Y,Z) = MI(X,Y) + MI(X,Z) + MI(Y,Z) - [H(X) + H(Y) + H(Z) - H(X,Y,Z)]


where H(X) is calculated as:

H(X) = -sum( p(X) * log p(X) )


The problem is that I don't know how to calculate the joint entropy H(X,Y,Z). Does anyone know how to calculate it? Also, if anyone knows how to extend the MI to more than three positions, I would be grateful for the info.
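
One common way to estimate the joint entropy is to treat each triple of symbols (one per sequence) as a single outcome and apply the ordinary entropy formula to the triple frequencies. The Python sketch below follows that idea under some assumptions: each position is a string with one symbol per sequence, probabilities are plain relative frequencies, gaps are counted as ordinary symbols, and the logarithm is base 2. The names `joint_entropy` and `mi3` are hypothetical.

```python
from collections import Counter
from math import log2

def joint_entropy(*columns):
    """H(X1,...,Xk) in bits; each column is a string, one symbol per sequence."""
    n = len(columns[0])
    counts = Counter(zip(*columns))  # one tuple of symbols per sequence
    return -sum((c / n) * log2(c / n) for c in counts.values())

def mi3(x, y, z):
    """Three-position MI, following the formula given in the post."""
    h = joint_entropy

    def mi(a, b):
        # Pairwise MI written with entropies: MI(A,B) = H(A) + H(B) - H(A,B)
        return h(a) + h(b) - h(a, b)

    return mi(x, y) + mi(x, z) + mi(y, z) - (h(x) + h(y) + h(z) - h(x, y, z))

# Hypothetical toy columns: four sequences, three perfectly covarying positions.
x, y, z = "AAGG", "UUCC", "AAGG"
print(joint_entropy(x, y, z))  # 1.0
print(mi3(x, y, z))            # 1.0
```

Because `joint_entropy` accepts any number of columns, the same function gives H(X), H(X,Y), H(X,Y,Z) and higher-order joint entropies, which are the ingredients needed to go beyond three positions. Note that the number of possible symbol tuples grows quickly with the number of positions, so estimates from a small alignment become unreliable.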

Thanks to all! =)

RNA Mutual-Information Entropy • 3.5k views
8.1 years ago

I don't really understand why this matters. Why can you not take the minimum of the individual entropies of the different sequences? They are all equally valid. So, if two of them have very complex sequences, and the third is poly-A, well, you're not doing anything useful. I posit that the information content of a multiple sequence alignment is constrained by the most uninformative sequence you decide to include.