Question: Statistics: Getting The Standard Error For Expression Level Fold Change, Based On Geometric Averages
2
8.1 years ago by
David M550
David M550 wrote:

I have two sets of normalized expression data from a qPCR experiment. In order to perform my statistical analysis I've had to log transform the data from each set. I'd like to get a standard error associated with the mean of the log transformed set. Further, I'd like to back-transform (linearize) the data in order to get a meaningful fold-change (basically a ratio of the two means), along with an accompanying standard error.

This boils down to two questions:

• How can I calculate a standard error for a back-transformed log mean? In other words, how can I back-transform the standard error of a set of log-transformed values?

• How can I incorporate the standard errors from two different back-transformed log means into a single standard error for the accompanying ratio of back-transformed log means?

I realize this might be a bit confusing, and I'm not sure where else to ask about it. Any help or pointers in the right direction would be greatly appreciated.

microarray statistics • 15k views
written 8.1 years ago by David M550
2

Are you sure that back-transforming would linearize? I mean, if we log-transform data, it's usually to linearize exponentially distributed data, no?

I'm not sure if this note helps: http://www.bmj.com/content/312/7038/1079.full

Fold-change using your log-transformed values is just as "meaningful" as that from the original values. For example, if you used log base 2, then a difference in means of 1 = a mean fold-change of 2; a difference of 2 = a fold-change of 4, and so on.
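To make the arithmetic above concrete, here is a minimal Python sketch. It assumes the values were log2-transformed; the function name `fold_change_from_log2` and the example means are made up for illustration.

```python
def fold_change_from_log2(mean_log2_a: float, mean_log2_b: float) -> float:
    """Fold change of group A over group B, given the means of their log2 values.

    Assumes both means were computed on log2-transformed data.
    """
    # A difference in log2 space corresponds to a ratio in linear space.
    return 2.0 ** (mean_log2_a - mean_log2_b)


# Hypothetical log2 means, chosen only to show the pattern.
print(fold_change_from_log2(5.0, 4.0))  # difference of 1 -> 2.0
print(fold_change_from_log2(6.0, 4.0))  # difference of 2 -> 4.0
```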

You should probably be doing everything in log space. If you get a standard error, you can always figure out the (now asymmetrical) confidence interval in linear space if needed.

You shouldn't 'back-transform'; the log transform was most likely done for a reason, e.g. to make the distribution symmetric and more similar to a normal distribution. The standard error makes much more sense when the error is normally distributed.

1
8.1 years ago by
Woa2.7k
United States
Woa2.7k wrote:

I'm not sure if this note can help:

http://www.bmj.com/content/312/7038/1079.full

Some more notes in the series

http://www.medcalc.org/literature_notes.php

0
8.0 years ago by
David W4.7k
New Zealand
David W4.7k wrote:

Hi David,

I think you are on the right track with "infer in log-space, report in normal-space" - at least if the 'normal' numbers are more biologically meaningful. If I understand the two parts of your question then

1) The reason for applying the log-transformation was the skew in the data (i.e. more values on one side of the mean than the other), so a single-number standard error for the back-transformed mean is probably not useful. Instead you want the mean +/- a standard error (or 1.96 of them if you want the 95% CI). That is, do all the calculations in log-space, and only back-transform the ranges:

```
back-transformed mean = 10^log.mean
95% CI                = 10^(log.mean - log.stderror * 1.96), 10^(log.mean + log.stderror * 1.96)
```

(where log.mean is the mean of the transformed numbers, and log.stderror is their standard error). The new interval won't be symmetrical.
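As a rough illustration of the recipe above, the following Python sketch computes the mean and standard error in log space and back-transforms only the interval endpoints. The function name `back_transformed_ci`, the `base` parameter and the sample values are assumptions made for this example. It uses the normal multiplier 1.96 as in the formula; with very few replicates a t-distribution quantile may be more appropriate.

```python
import math
import statistics


def back_transformed_ci(log_values, base=10.0, z=1.96):
    """Return (centre, lower, upper) in linear space.

    log_values: data already log-transformed with the given base (assumed).
    z: multiplier for the standard error (1.96 for an approximate 95% CI).
    """
    n = len(log_values)
    log_mean = statistics.mean(log_values)
    # Standard error of the mean, computed in log space.
    log_se = statistics.stdev(log_values) / math.sqrt(n)
    centre = base ** log_mean  # this is the geometric mean of the raw data
    lower = base ** (log_mean - z * log_se)
    upper = base ** (log_mean + z * log_se)
    return centre, lower, upper


# Made-up log10 values, for illustration only.
centre, lower, upper = back_transformed_ci([1.0, 2.0, 3.0], base=10.0)
print(centre, lower, upper)
# The interval is not symmetric around the centre in linear space:
print(upper - centre > centre - lower)
```

The `base` argument must match the logarithm that was actually applied to the data.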

2) I don't know how to find the standard error of a ratio where each coefficient is itself normally distributed. I guess there will be a way, and it will be hard! If your data can be easily resampled, bootstrapping might be a better option. However you find the number, the ratio will be dimensionless, so I don't know that you want to back-transform it?
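One possible way to approach this, sketched below in Python, is to stay in log space, where the ratio of geometric means becomes a difference of log means. If the two groups are independent (an assumption), the standard error of that difference is the square root of the sum of the squared standard errors, and the interval endpoints can then be back-transformed. A percentile bootstrap is included as the resampling alternative. The function names, the base of 2 and the data are all hypothetical.

```python
import math
import random
import statistics


def log_ratio_ci(log_a, log_b, base=2.0, z=1.96):
    """Fold change A/B with an approximate CI, from log-transformed data.

    Assumes independent groups and a normal approximation.
    """
    diff = statistics.mean(log_a) - statistics.mean(log_b)
    se_a = statistics.stdev(log_a) / math.sqrt(len(log_a))
    se_b = statistics.stdev(log_b) / math.sqrt(len(log_b))
    # Standard errors of independent means add in quadrature.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return base ** diff, base ** (diff - z * se_diff), base ** (diff + z * se_diff)


def bootstrap_ratio_ci(log_a, log_b, base=2.0, n_boot=2000, seed=0):
    """Percentile bootstrap 95% interval for the fold change A/B."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        resample_a = rng.choices(log_a, k=len(log_a))
        resample_b = rng.choices(log_b, k=len(log_b))
        diff = statistics.mean(resample_a) - statistics.mean(resample_b)
        ratios.append(base ** diff)
    ratios.sort()
    return ratios[int(0.025 * n_boot)], ratios[int(0.975 * n_boot) - 1]


# Made-up log2 expression values, for illustration only.
log2_a = [5.1, 4.8, 5.3, 5.0]
log2_b = [4.0, 4.2, 3.9, 4.1]
print(log_ratio_ci(log2_a, log2_b))
print(bootstrap_ratio_ci(log2_a, log2_b))
```

With only a handful of replicates, both intervals are rough; the bootstrap in particular can be unstable for very small samples.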

1

Be careful: this is only correct if the log base was 10. It could have been log2 (most likely) or natural log!

Good point, Michael. For instance, if you just type `log(x)` in R you get a natural log and need to use `exp` to back-transform.
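The same pitfall can be shown with a short Python sketch (Python's `math.log`, like R's `log`, defaults to the natural log). The value of `x` is arbitrary and chosen only for illustration.

```python
import math

x = 8.0

# Each logarithm is undone by exponentiating with the matching base.
print(math.exp(math.log(x)))   # natural log, undone by exp
print(2 ** math.log2(x))       # log2, undone by 2 ** y
print(10 ** math.log10(x))     # log10, undone by 10 ** y

# Mismatched base: log2(8) is 3, so this gives 10**3 = 1000, not 8.
print(10 ** math.log2(x))
```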