Question

Why Cannot Limma Package Do Differential Expression Between Two Samples Without Replication?

10.9 years ago

shangzhong0619 ▴ 20

I am trying to do differential expression analysis of microarray data with the limma package. I usually compare two groups. However, when each group has only one sample, the analysis fails at the eBayes step. My code comes from GEO2R. When a group contains more than one sample, it works. Does anyone know why that is? How can I do differential expression analysis between only two samples? Thank you.

affymetrix limma differential-expression replicates • 13k views

updated 6.3 years ago by Biostar 20 • written 10.9 years ago by shangzhong0619 ▴ 20

See also "Microarray data WITHOUT replicates. Any hope to get something out of it?" for a more general view; it is something of a universal answer. By the way, I do not recommend closing this question, because it has a slightly different focus.

10.9 years ago by Michael 54k

Just curious: how many replicates do you need to get good results from limma? In other words, does limma have any recommendation on the number of replicates?

10.9 years ago by Woa ★ 2.9k

This kind of analysis is called power analysis. If you search for "power analysis microarray data" you will find several tools, e.g. the BioC package SSPA.
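As a rough illustration of what such a power analysis computes, here is a minimal Python sketch using the textbook normal approximation for a two-group comparison of a single gene. It is not what SSPA does. The function name `replicates_per_group` and the effect size and standard deviation values are made up for illustration, and the calculation ignores multiple testing across genes.

```python
import math
from statistics import NormalDist


def replicates_per_group(effect, sd, alpha=0.05, power=0.8):
    """Approximate replicates per group needed to detect a mean difference.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) * sd / effect)^2
    for a two-sided test; 'effect' and 'sd' are on the same (e.g. log2) scale.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance cutoff
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


# Hypothetical inputs: a 1.0 log2 fold change with per-gene SD of 0.5.
n_needed = replicates_per_group(effect=1.0, sd=0.5)
print(n_needed)
```

The answer depends heavily on the assumed variability: doubling the standard deviation roughly quadruples the number of replicates required.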

10.9 years ago by Michael 54k

10.9 years ago

Simon Cockell 7.4k

In order to apply a t-test between two samples (which is, at the end of the day, what limma is doing, albeit in a fancy way), we need to know three things about each of the two samples:

The mean
The sample size
The variance (or more specifically the standard deviation)

In your case you have an estimate of the mean (the measurement of expression for a given gene on your one replicate per sample) and the sample size (1), but you have no way of calculating the variance (if we assume the two populations your samples are taken from are independent - a reasonable assumption, given the hypothesis we are testing).

Even if your one measurement gave us a good estimate of the mean (and this is by no means certain - there are many, many reasons why your sample could be an outlier), without the variance we simply have no route to calculate the t-statistic for the tests.

For this fairly simple technical reason, it is impossible to apply limma (or indeed any valid statistical test) in a situation where you have no replicates in at least one of your sample groups. This is just one of the many reasons why replicates are absolutely required when doing microarray (and indeed, RNA-Seq) experiments.
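To make the three ingredients concrete, here is a minimal sketch in plain Python of an ordinary Welch t-statistic (not limma's moderated version). The expression values are hypothetical log2 intensities for a single gene. With one value per group, the variance step fails before any t-statistic can be formed.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t-statistic from the mean, size and variance of each group."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # statistics.variance divides by (n - 1), so it needs at least two values.
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical log2 expression values for one gene, three replicates per group.
with_replicates = welch_t([8.1, 8.4, 8.2], [9.0, 9.3, 9.1])

# Same gene with a single array per group: the variance cannot be computed.
try:
    without_replicates = welch_t([8.1], [9.0])
except statistics.StatisticsError as error:
    without_replicates = None
    failure_reason = str(error)

print(with_replicates, without_replicates)
```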


Hi, Simon. Can you explain a little more about how limma works? I understand the t-test, but I am not quite clear on limma. Thanks in advance.

10.9 years ago by Tky ★ 1.0k

Did you read the limma package vignette? It explains quite clearly how limma works, and it was written by the very smart author who created the package.

6.3 years ago by theobroma22 ★ 1.2k

Michael · Accepted Answer · 2013-06-12

This comes down to the basics of statistics. Though I am not a stats expert, I will try to clarify the issue a bit.

For each gene measured by your microarrays, you are asking limma whether the mean expression in group 1 differs from the mean expression in group 2, and whether that difference is larger than what you would expect from the random variation inherent in microarray profiling. However, as you have only one sample in each of the two groups, the mean of each group is simply the expression value of its single sample, and the variance within each group cannot be estimated.
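One way to see this is to count residual degrees of freedom, i.e. the number of samples left over for estimating variance once each group mean has been fitted. The sketch below assumes a simple design with one coefficient per group; `residual_degrees_of_freedom` is a hypothetical helper, not a limma function.

```python
def residual_degrees_of_freedom(group_labels):
    """Samples minus fitted group means, for a one-coefficient-per-group design."""
    n_samples = len(group_labels)
    n_groups = len(set(group_labels))  # one mean is estimated per distinct group
    return n_samples - n_groups


# One array per group: every sample is used up estimating a group mean.
one_vs_one = residual_degrees_of_freedom(["control", "treated"])

# Three arrays per group: four samples' worth of information remain for variance.
three_vs_three = residual_degrees_of_freedom(["control"] * 3 + ["treated"] * 3)

print(one_vs_one, three_vs_three)
```

With one array per group nothing is left over, whereas three arrays per group leave four degrees of freedom, which is consistent with the same script working once replicates are added.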

Compare it with the situation where you ask whether one person called Peter is significantly taller than one person called Susan. Peter may simply be taller; you cannot do any statistics on that. If you ask whether men in general are taller than women, and you have measured 20 males and 20 females, you can take the mean of each group and the variance within each group, and ask whether the difference in mean height between the males and females is significant.
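The height example can be sketched with an exact permutation test, which is a different technique from limma's moderated t-test but runs into the same wall. All heights below are made-up numbers in cm, and the groups are kept to five people each so the enumeration stays small. With one person per group there are only two possible labelings, so the p-value cannot drop below 1; with five per group there are 252 labelings, and a clear separation can reach a small p-value.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    at_least_as_extreme = 0
    total = 0
    # Enumerate every way of relabeling n_a of the pooled values as "group A".
    for indices in combinations(range(len(pooled)), n_a):
        chosen = set(indices)
        relabeled_a = [pooled[i] for i in chosen]
        relabeled_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        difference = abs(sum(relabeled_a) / n_a - sum(relabeled_b) / n_b)
        if difference >= observed - 1e-12:  # tolerance for float rounding
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


# Hypothetical heights in cm.
peter_vs_susan = exact_permutation_p([185.0], [168.0])

men = [185.0, 178.0, 181.0, 176.0, 183.0]
women = [168.0, 165.0, 171.0, 163.0, 170.0]
men_vs_women = exact_permutation_p(men, women)

print(peter_vs_susan, men_vs_women)
```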

I hope someone can explain it in more statistically sound terms.