Question: Mathematical question: Correlate ln transformed traits with log2 transformed expression data
4.5 years ago by
Tobias.Wohland60 wrote:

Hi, I have the following problem:

I have expression data which I transformed via log2 and RSN (with lumi). Besides using it for differential expression etc., I exported the expression of some interesting genes to correlate them with phenotypes of our cohort.

To transform the cohort traits we usually use the natural logarithm (ln).

My question is whether I can correlate ln-transformed traits with log2-transformed expression data. Of course I can, but is it correct? I tried it with some random data and checked log2, log10 and ln. The effect direction stays the same, but the p-value and correlation coefficient differ. It would be no problem to transform the traits with log2 instead, but it would be interesting to know what you think about correlating two variables that were transformed with different log bases. Is it mathematically correct?

Thanks in advance.


R • 2.0k views
modified 3.8 years ago by Michael Dondrup47k • written 4.5 years ago by Tobias.Wohland60

Mathematical correctness doesn't enter into this; it is a statistical question. You have to think from a reader's perspective: would it unnecessarily confuse the reader to have the logarithm base be 2 on one axis and e on the other? Probably. If every other figure in your paper uses log2 for the expression data, I'd probably log2-transform your response traits. (Nonetheless, I don't think the choice of base should affect your p-value...)

modified 4.5 years ago • written 4.5 years ago by russhh4.9k

I'm not sure I even understand the question. Can someone here rephrase it?

written 4.5 years ago by SmallChess500

Can you please be a bit more specific about what you did not understand? Or is it the whole question?

written 4.5 years ago by Tobias.Wohland60
3.8 years ago by
Bergen, Norway
Michael Dondrup47k wrote:

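I agree with russhh's comment that this is not a matter of correctness, as long as you annotate the axes in your plots. If the log transformation linearizes your variables, then the base of the logarithm should have no effect on the correlation: changing the base only rescales the values by a constant factor, since log2(x) = ln(x) / ln(2), and correlation coefficients are invariant to such rescaling.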
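To make this concrete, here is a minimal sketch in Python (not part of the original thread). The data are simulated, and the helper name `pearson` and the variables `expression` and `trait` are assumptions chosen only for illustration. It computes the Pearson coefficient for several combinations of log bases and shows that they agree up to floating-point rounding.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)

# Simulated, strictly positive values (illustrative only, not real data).
expression = [math.exp(random.gauss(5, 1)) for _ in range(200)]
trait = [math.sqrt(e) * math.exp(random.gauss(0, 0.5)) for e in expression]

# Same data, different combinations of log bases.
r_log2_ln = pearson([math.log2(e) for e in expression],
                    [math.log(t) for t in trait])
r_log2_log2 = pearson([math.log2(e) for e in expression],
                      [math.log2(t) for t in trait])
r_ln_log10 = pearson([math.log(e) for e in expression],
                     [math.log10(t) for t in trait])

print(r_log2_ln, r_log2_log2, r_ln_log10)
```

The three printed values should be identical apart from rounding in the last digits. The p-value of the usual correlation test depends only on the coefficient and the sample size, so it should not change with the base either.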

If in doubt, and provided no other transformations have been applied, you can always convert between them:

say: z = ln(x) <=> e^z = x
say: z = log2(x) <=> 2^z = x
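A minimal sketch of that conversion in Python (not from the original thread; the function names `ln_to_log2` and `log2_to_ln` and the example value are hypothetical). It relies only on the identity log2(x) = ln(x) / ln(2), so it applies only when the values are plain logarithms with no further normalization applied afterwards.

```python
import math

def ln_to_log2(z):
    """Convert z = ln(x) into log2(x), using log2(x) = ln(x) / ln(2)."""
    return z / math.log(2)

def log2_to_ln(z):
    """Convert z = log2(x) into ln(x), using ln(x) = log2(x) * ln(2)."""
    return z * math.log(2)

x = 37.5                    # arbitrary positive example value
z_ln = math.log(x)          # ln-transformed value
z_log2 = ln_to_log2(z_ln)   # same value on the log2 scale

# Back-transforming (e^z = x, 2^z = x) recovers the original value.
print(z_ln, z_log2, math.exp(z_ln), 2 ** z_log2)
```

Dividing by the constant ln(2) avoids the round trip through the original scale, which can matter numerically for very large or very small values.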
modified 3.8 years ago • written 3.8 years ago by Michael Dondrup47k