log2 reverses mean in two groups
8.0 years ago

I have come across a weird problem. I have a matrix of raw gene counts (RNA-Seq Level 3) and two groups, treated vs. untreated. When I compare the mean of the raw counts between the two groups, it is higher in the treated group. When I take log2 of the raw counts and compare the means again, the mean in the treated group is smaller than in the untreated group.

I plotted the raw counts vs. the log2 raw counts to see the distribution: the treated group has a few points with very large values, whereas the untreated group has fewer outliers. I don't know what is causing the flip in means. Since log is a monotonic function, I expected the mean in the treated group to remain higher than the mean in the untreated group.

Has anyone faced a problem like this before?

Thank you.

R RNA-Seq
8.0 years ago
russhh 5.7k

Since when doing differential expression we typically work in terms of fold-changes, you should really be comparing the geometric means of the count numbers, not the arithmetic mean.
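To make the link explicit: averaging log2 values is the same as taking log2 of the geometric mean, which is why the comparison changes once you log-transform. A minimal Python sketch, assuming made-up counts (the values and variable names are illustrative only, not data from the question):

```python
import math
from statistics import mean, geometric_mean

# Hypothetical counts, chosen only for illustration (all must be > 0 for log2).
counts = [4, 8, 16, 1024]

# Mean taken on the log2 scale.
mean_of_logs = mean(math.log2(c) for c in counts)

# log2 of the geometric mean: matches the mean of the logs.
log_of_geo_mean = math.log2(geometric_mean(counts))

# log2 of the arithmetic mean: pulled upwards by the single large value.
log_of_arith_mean = math.log2(mean(counts))

print(mean_of_logs, log_of_geo_mean, log_of_arith_mean)
```

The first two numbers agree up to floating-point error, while the third is larger because one big count dominates the arithmetic mean.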

Consider the two sequences A = [1, 1] and B = [0.5, 2]. The geometric mean of both A and B is 1. The arithmetic mean of A is 1, and that of B is 1.25. So if we reduce either of the entries in B by a tiny amount,

i) its geometric mean would be slightly smaller than that of A, and

ii) its arithmetic mean would still be higher than that of A.

For example, comparing A = [1, 1] with B = [0.5, 1.75] should give you the same thing you've just seen:

mean(A) = 1; mean(B) = 1.125

but,

mean(log A) = 0; mean(log B) ~ -0.067 (natural log; with log2 it is ~ -0.096)
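If you want to check this yourself, here is a small Python sketch of the same toy example. It uses log2, as in the question, so the exact log-scale value depends on the base, though the sign does not. The helper name mean_log2 is my own, not from any library:

```python
import math
from statistics import mean

A = [1, 1]
B = [0.5, 1.75]

def mean_log2(xs):
    """Arithmetic mean of the log2-transformed values (hypothetical helper)."""
    return mean(math.log2(x) for x in xs)

print(mean(A), mean(B))            # arithmetic means on the raw scale
print(mean_log2(A), mean_log2(B))  # means on the log2 scale

# The ordering of the two groups flips between the two scales.
raw_says_B_higher = mean(B) > mean(A)
log_says_B_lower = mean_log2(B) < mean_log2(A)
print(raw_says_B_higher, log_says_B_lower)
```

Log is monotonic for individual values, but the mean of the logs is not the log of the mean, so the ordering of group means is not guaranteed to be preserved.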
