Question

Mathematical question: Correlate ln transformed traits with log2 transformed expression data

10.0 years ago

Tobias.Wohland ▴ 70

Hi, I have the following problem:

I have expression data which I transformed via log2 and rsn (with lumi). Besides using it for differential expression etc., I exported the expression of some interesting genes to correlate them with phenotypes of our cohort.

To transform the cohort traits we usually use the natural logarithm (ln).

My question is whether I can correlate ln-transformed traits with log2-transformed expression data. Of course I can, but is it correct? I tried it with some random data and checked log2, log10 and ln. The effect direction stays the same, but the p-value and correlation coefficient differ. It is no problem to transform the traits with log2 instead, but it would be interesting to know what you think about correlating two differently transformed variables. Is it mathematically correct?

Thanks in advance.

Best,
Tobi

R • 4.5k views

updated 2.6 years ago by Ram 45k • written 10.0 years ago by Tobias.Wohland ▴ 70

Mathematical correctness doesn't enter into this; it is a statistical question. You have to think from a reader's perspective: would it unnecessarily confuse the reader to have base 2 on one axis and base e on the other? Probably. If every other figure in your paper uses log2 for the expression data, I'd probably log2-transform your response traits as well. (Nonetheless, I don't think the choice of base should affect your p-value...)

updated 2.6 years ago by Ram 45k • written 10.0 years ago by russhh 5.8k

I'm not sure I even understand the question. Can someone here rephrase?

updated 2.6 years ago by Ram 45k • written 10.0 years ago by scchess ▴ 640

Could you please be a bit more specific about what you did not understand? Or is it the whole question?

updated 2.6 years ago by Ram 45k • written 10.0 years ago by Tobias.Wohland ▴ 70

Ram · Answer 1 · 2016-03-12

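I agree with russhh's comment: this is not a question of correctness, as long as you annotate the axes in your plots. What matters is whether the log transformation linearizes the relationship between your variables. The base of the logarithm itself has no effect on the correlation coefficient, because logarithms to different bases differ only by a constant positive factor, and Pearson and rank-based correlations are unchanged by such a rescaling.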

If in doubt, and provided no other transformations have been applied, you can always convert between them:

z = ln(x) <=> x = e^z
w = log2(x) <=> x = 2^w

so log2(x) = ln(x) / ln(2).
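
To check this numerically, here is a minimal sketch in plain Python (standard library only). The data are simulated and purely hypothetical, and the `pearson` helper and variable names are assumptions made for illustration, not part of the original analysis. It correlates log2-transformed "expression" values with the same "trait" transformed by ln, log2 and log10.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical data: positive "expression" values and a loosely related positive "trait".
expression = [random.lognormvariate(5, 1) for _ in range(200)]
trait = [e ** 0.3 * random.lognormvariate(0, 0.5) for e in expression]

log2_expr = [math.log2(e) for e in expression]

# The same trait values, transformed with three different bases.
ln_trait = [math.log(t) for t in trait]
log2_trait = [math.log2(t) for t in trait]
log10_trait = [math.log10(t) for t in trait]

r_mixed = pearson(log2_expr, ln_trait)     # log2 vs. ln
r_same = pearson(log2_expr, log2_trait)    # log2 vs. log2
r_log10 = pearson(log2_expr, log10_trait)  # log2 vs. log10

# The three coefficients agree up to floating-point rounding.
print(r_mixed, r_same, r_log10)
```

Because the p-value of the usual test for a Pearson correlation depends only on the coefficient and the sample size, it is the same for all three bases as well. If the coefficient changes between bases on identical data, it is worth checking whether something else changed between the runs.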