Question: Help needed with selecting top DE genes, Microarray data analysed with "limma"
4.8 years ago, venu (6.7k) wrote:

Hello all,

I've analysed microarray data with the limma package and ended up with a list of genes that are differentially expressed. By default, limma ranked the DE genes by the B-statistic, and from the reference manual (page 4) I thought that would be a good parameter to rank by. However, according to previous threads and some suggestions, the adjusted p-value would be a more useful parameter for selecting significant DE genes. In my results the adjusted p-values are quite high (0.1 - 0.9; nothing is < 0.05). What might be the reason? Should I repeat the analysis with any changes, or is this normal?

I would also like to mention that the top DE genes selected by B-statistic actually agree with the experiments we are doing in the lab (the list contains a number of genes that we expected to be differentially expressed), so I am a little biased when it comes to the significance values.


  • Illumina platform
  • 12 completely independent cell lines (based on some experimental results, we've grouped them into 2 groups of 6)
  • Normalization - neqc function from limma.

I can provide more information if required.


4.8 years ago, andrew.j.skelton73 (6.0k) wrote:

This is quite a difficult question to answer without more detail on your experimental design. Useful information would be the platform you're using, normalisation method, tissue used, species, number of replicates, comparison of interest, etc. This could come down to a lot of different things, but I'd say power is the most obvious one. If you're seeing potentially high log odds ratios, that can indicate that something is happening, but without an adjusted p-value below a reasonable threshold (0.05), it's not statistically significant in that dataset. The only other thing I can think of is technical variation, so you could use PCA, or methods from the SVA package, to check whether any overwhelming technical variation is present.
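As a rough illustration of why adjusted p-values can sit far above the raw ones, here is a minimal pure-Python sketch of the Benjamini-Hochberg adjustment (the default `adjust.method` in limma's `topTable`). The function name `bh_adjust` and the example p-values are made up for illustration; this is not limma's own code.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values, for illustration only.
    raw = [0.0004, 0.001, 0.003, 0.2, 0.6]
    for p, q in zip(raw, bh_adjust(raw)):
        print(f"raw={p:<8} adjusted={q:.4f}")
```

Each raw p-value is scaled by m / rank, where m is the number of tests. With tens of thousands of probes and only 6 samples per group, even the best-ranked probes can end up with adjusted values well above 0.05.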


Thanks, Andrew, for the additional points to help narrow down the problem. I've updated the question.

(reply by venu, 4.8 years ago)

I don't really work with tumours, so someone more familiar with them may be able to comment on how that should be weighed against sample size. I'd suggest you take a probe that targets a gene you're familiar with and visualise its log2 expression in each sample, to see how variable the samples are and what the means by sample type look like. Your normalisation choice is fine. Feel free to update the question with your code so we can take a quick look through, but other than that, there's not much more I can suggest.
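The per-probe check described above can be sketched numerically before plotting. This is a minimal example, assuming you have already pulled one probe's normalised log2 values and the matching group labels out of your data; the function name `summarise_probe` and all values shown are hypothetical.

```python
from statistics import mean, stdev


def summarise_probe(log2_values, groups):
    """Per-group sample count, mean, and standard deviation for one probe."""
    by_group = {}
    for value, group in zip(log2_values, groups):
        by_group.setdefault(group, []).append(value)
    # stdev needs at least two samples per group.
    return {
        g: {"n": len(v), "mean": mean(v), "sd": stdev(v)}
        for g, v in by_group.items()
    }


if __name__ == "__main__":
    # Hypothetical log2 expression for one probe across 12 cell lines.
    values = [7.1, 7.4, 9.8, 7.0, 8.9, 7.3, 8.2, 8.6, 7.2, 9.1, 8.4, 8.8]
    labels = ["A"] * 6 + ["B"] * 6
    for group, stats in summarise_probe(values, labels).items():
        print(group, stats)
```

If the within-group standard deviation is large relative to the difference between group means, that points to a power problem with 6 samples per group rather than an error in the analysis.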

(reply by andrew.j.skelton73, 4.8 years ago)