Hi,
I used to believe that all column sums in a PFM (or position count matrix) should be equal, since the matrix is derived from aligned read counts. That turns out not to be the case, as I found with this JASPAR matrix, MA0466.2:
http://jaspar.genereg.net/cgi-bin/jaspar_db.pl?ID=MA0466.2&rm=present&collection=CORE
MA0466.2 CEBPB
A [ 10585    33    17 10930   393  3521  2527 22732 22732   155 ]
C [  5007    34    14    34 22732   555 22732   636     9 10617 ]
G [  6282     8  2276 11803   221 22732     3     9     0   940 ]
T [   859 22732 22732  2237  2217   486  2135     0     9 12116 ]
column sum: 22733 22807 25039 25004 25563 27294 27397 23377 22750 23828
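For anyone who wants to check the sums, here is a minimal Python sketch. The counts are copied from the matrix above; the variable names (`pfm`, `column_sums`, `ppm`) are just my own choices, and the normalization at the end is only one common way to handle unequal column totals, not something JASPAR prescribes.

```python
# Counts copied from the MA0466.2 (CEBPB) matrix quoted above.
pfm = {
    "A": [10585, 33, 17, 10930, 393, 3521, 2527, 22732, 22732, 155],
    "C": [5007, 34, 14, 34, 22732, 555, 22732, 636, 9, 10617],
    "G": [6282, 8, 2276, 11803, 221, 22732, 3, 9, 0, 940],
    "T": [859, 22732, 22732, 2237, 2217, 486, 2135, 0, 9, 12116],
}

# zip(*rows) transposes the matrix, so each tuple is one alignment column.
column_sums = [sum(col) for col in zip(*pfm.values())]
print(column_sums)

# Dividing each count by its own column total gives per-position
# frequencies that sum to 1 even though the raw totals differ.
ppm = {
    base: [count / total for count, total in zip(counts, column_sums)]
    for base, counts in pfm.items()
}
```

Normalizing per column like this means the unequal totals do not matter when the matrix is turned into probabilities.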
This matrix is from HT-SELEX. Can anyone tell me why this happens? Could it be due to small fragments in the DNA pool?
Thanks
I think I got the answer from this paper: http://genome.cshlp.org/content/20/6/861.full.pdf+html