Question: log2 reverses mean in two groups
3.2 years ago, sharmi.banerji wrote:

I have come across a weird problem. I have a matrix of raw gene counts (RNASeq Level 3) and two groups, treated vs. untreated. When I check the mean of the raw counts in the two groups, it is higher in the treated group. But when I take log2 of the raw counts and check the means again, the mean in the treated group is smaller than in the untreated group.

I plotted the raw counts vs. the log raw counts to see the distribution: the treated group has a few points with large values, whereas the untreated group has fewer outliers. I don't know what is causing the flip in means. Since log is a monotonic function, I expected the mean in the treated group to remain higher than the mean in the untreated group.

Has anyone faced a problem like this before?

Thank you.

Tags: rna-seq, R
3.2 years ago, russhh (UK, U. Glasgow) wrote:

Since when doing differential expression we typically work in terms of fold-changes, you should really be comparing the geometric means of the count numbers, not the arithmetic mean.
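
To spell out why the geometric mean is the relevant quantity: the mean of log-transformed values is the log of the geometric mean, not the log of the arithmetic mean. A short derivation, where x_1, ..., x_n (notation introduced here for illustration) are the positive counts for one gene in one group:

```latex
\frac{1}{n}\sum_{i=1}^{n} \log_2 x_i
  = \log_2\!\left( \prod_{i=1}^{n} x_i \right)^{1/n}
  = \log_2 \mathrm{GM}(x),
\qquad
\mathrm{GM}(x) \le \mathrm{AM}(x) = \frac{1}{n}\sum_{i=1}^{n} x_i .
```

Because log is monotonic, comparing means of log values orders the groups by their geometric means. The gap between AM and GM tends to widen as the values become more spread out, so a group with a few very large counts can have the higher arithmetic mean and still the lower geometric mean.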

Consider the two sequences A = [1, 1] and B = [0.5, 2]. The geometric mean of both A and B is 1. The arithmetic mean of A is 1, and that of B is 1.25. So if we reduce either of the entries in B by a tiny amount,

i) its geometric mean would be slightly smaller than that of A, and

ii) its arithmetic mean would still be higher than that of A.

For example, comparing A = [1, 1] with B = [0.5, 1.75] should give you the same thing you've just seen:

mean(A) = 1; mean(B) = 1.125

mean(logA) = 0; mean(logB) ~ -0.067 (natural log; with log2 it is ~ -0.096)
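
If you want to check the example numerically, here is a minimal sketch in plain Python (standard library only). The function name `summarize` is made up for illustration, and it assumes all values are positive. It uses log2 to match the question; the value of about -0.067 comes from the natural log, and the sign flip is the same in any base.

```python
import math
from statistics import mean

# Toy data from the example above (not real counts).
A = [1.0, 1.0]
B = [0.5, 1.75]

def summarize(x):
    """Return arithmetic mean, mean of log2 values, and geometric mean.

    Assumes every value in x is strictly positive (log2 is undefined at 0).
    """
    logs = [math.log2(v) for v in x]
    return {
        "arithmetic_mean": mean(x),
        "mean_log2": mean(logs),
        # Back-transforming the mean of the logs gives the geometric mean.
        "geometric_mean": 2 ** mean(logs),
    }

summary_a = summarize(A)
summary_b = summarize(B)

print("A:", summary_a)
print("B:", summary_b)
```

B has the larger arithmetic mean but the smaller mean of log2 values, which is the same flip as in the question.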
