Question: A lot of differentially expressed genes
5.1 years ago by
Iran, Islamic Republic Of
mahdijalili20 wrote:


More than 17,000 probes out of 48,000 are differentially expressed (adjusted p-value < 0.05) in a microarray analysis (limma R package, 22 case samples vs 8 controls). Is this result correct and acceptable?


deg microarray • 2.1k views
5.1 years ago by
svlachavas680 wrote:

Dear Mahdijalili,

What is the experimental design of your analysis? For instance, are the case samples cancer samples? If so, you can expect a lot of differential expression between cancer and control samples. Or do the case samples represent some drug treatment? Either way, it would help if you could give us more information about your procedure (the specific microarray platform, whether you performed any non-specific filtering prior to DE testing, etc.). Generally:

  • I believe there is no general answer as to whether such a result is "correct" or "acceptable", as it depends on the biology of your system and on your samples and analysis. Also, limma is capable of taking into account the imbalance between the case and the control samples.


  • Finally, you should generally not define DEGs by an adjusted p-value threshold alone; use that cutoff only to get a first pool of DEG candidates. You can use the functions treat() and topTreat() from limma to test differential expression against a minimum log-fold-change cutoff. This will reduce the number of DEGs and will probably also give a more biologically meaningful subset of genes for your analysis.





Thanks Efstathios,

Yes, the cases are untreated APL (M3 leukemia) samples vs normal BM samples. Thanks for your final suggestion. Also, are there any similar functions for packages like RankProd (rank product analysis)?

Comment written 5.1 years ago by mahdijalili

In this case, the large number of DEGs (based on the adjusted p-value criterion only) can be explained by the generally large differences between your leukemia samples and the normal ones. Regarding RankProd, I am familiar with it but I would not recommend it here, as it is usually used with only a few samples and, in simple words, just "ranks the log-FCs". limma is far more powerful and has many more capabilities for handling issues such as small sample sizes or unbalanced studies.

In fact, you could use treat() after lmFit(), instead of the eBayes() step, and then use topTreat() like topTable() to return the subset of DEG candidates. Check ?treat in the limma package. For instance, you can use lfc=0.5 to get genes whose log2 fold change is significantly greater than 0.5 in absolute value, i.e. a fold change above roughly 1.4 (2^0.5 ≈ 1.41); for a 1.5-fold threshold, use lfc=log2(1.5).
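The treat() workflow described above could be sketched roughly as follows. This is only an illustration on simulated data: the matrix `expr`, the 8 vs 22 group layout, the spiked-in probes and the 1.5-fold threshold are all assumptions made for the example, not the poster's actual data or settings.

```r
library(limma)

set.seed(1)

# Simulated stand-in for a normalised log2 expression matrix (probes x samples).
# Everything about this data set is hypothetical.
group <- factor(rep(c("control", "case"), times = c(8, 22)),
                levels = c("control", "case"))
n_probes <- 1000
expr <- matrix(rnorm(n_probes * length(group), mean = 8, sd = 0.5),
               nrow = n_probes,
               dimnames = list(paste0("probe", seq_len(n_probes)), NULL))

# Spike a 4-fold change (2 log2 units) into the first 50 probes of the cases.
expr[1:50, group == "case"] <- expr[1:50, group == "case"] + 2

design <- model.matrix(~ group)  # column 2 is the case vs control effect
fit <- lmFit(expr, design)

# treat() takes the place of eBayes() and tests whether |log2 FC| exceeds lfc.
fit_treat <- treat(fit, lfc = log2(1.5))

# topTreat() is used like topTable(); keep probes with adjusted p-value < 0.05.
deg <- topTreat(fit_treat, coef = 2, number = Inf, p.value = 0.05)
nrow(deg)
```

With real data, `expr` would be the normalised expression matrix (or an ExpressionSet) and `group` the actual case/control labels; the lfc value is a judgment call that depends on the biology.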

Finally, I would also consider non-specific intensity filtering to remove probes that are lowly expressed in most samples, which would be uninformative for any further analysis.
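One possible way to do such an intensity filter in base R is sketched below. The cutoff of 6 on the log2 scale and the rule "expressed in at least as many samples as the smallest group" are assumptions chosen for illustration, and the data are simulated; a sensible cutoff depends on the platform and the normalisation used.

```r
set.seed(1)

# Hypothetical log2 expression matrix: 800 expressed probes and
# 200 probes sitting near background, across 30 samples.
expressed  <- matrix(rnorm(800 * 30, mean = 8, sd = 1),   nrow = 800)
background <- matrix(rnorm(200 * 30, mean = 4, sd = 0.5), nrow = 200)
expr <- rbind(expressed, background)

cutoff      <- 6  # assumed intensity threshold (log2 scale)
min_samples <- 8  # assumed: size of the smallest group (8 controls)

# Keep a probe if it exceeds the cutoff in at least min_samples samples.
keep <- rowSums(expr > cutoff) >= min_samples
expr_filtered <- expr[keep, , drop = FALSE]

dim(expr_filtered)
```

The filter uses no group labels, only overall intensities, which is what makes it "non-specific"; the filtered matrix would then go into lmFit() as usual.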

Reply written 5.1 years ago by svlachavas