Question: p-value in Limma vs Graphpad
4 weeks ago by
LuisNagano20
University of Campinas
LuisNagano20 wrote:

Hello guys,

I have a statistical question. When I analyze my RMA-normalized microarray data with limma, I get a p-value and an adjusted p-value from the differential expression analysis. But when I run the same normalized data through GraphPad, I get a very different p-value. I know that limma will necessarily give me different values from GraphPad, but for a bar graph comparing two columns, can I use the limma p-value to mark statistical significance, or do I need to use the adjusted p-value to say that there is a significant difference between my samples?

Thank you!

graphpad limma adj p p-value • 163 views
modified 4 weeks ago by dsull420 • written 4 weeks ago by LuisNagano20

I suggest you read the limma papers to understand what it is doing (model fitting, etc.). In short, it is not doing a simple t-test, as dsull already pointed out. Run the entire dataset through limma and take the adjusted p-values. Do not start custom approaches outside of the established packages for microarray or RNA-seq data if you have no expert knowledge. The introduction of the limma user's guide cites the relevant literature: https://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf

modified 4 weeks ago • written 4 weeks ago by ATpoint25k
4 weeks ago by
dsull420
UCLA
dsull420 wrote:

Yes, you will get different p-values: limma uses a different approach to assessing statistical significance than the ordinary Student's t-test.
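To make the difference concrete, here is a rough sketch in plain Python (not limma's actual code) of an ordinary pooled-variance t-statistic next to a moderated one, where the gene's variance is shrunk toward a prior variance. The expression values, `prior_var` and `prior_df` are made up for illustration; limma estimates the prior from all genes on the array, and the moderated statistic is then referred to a t-distribution with the residual plus prior degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def ordinary_t(group_a, group_b):
    """Pooled-variance Student's t for two groups of log2 values.

    Returns (t statistic, pooled variance, residual degrees of freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    diff = mean(group_a) - mean(group_b)
    return diff / sqrt(pooled_var * (1 / n_a + 1 / n_b)), pooled_var, df


def moderated_t(group_a, group_b, prior_var, prior_df):
    """Same statistic, with the gene's variance shrunk toward prior_var.

    prior_var and prior_df are ASSUMED inputs in this sketch; limma
    estimates them from the whole dataset.
    """
    n_a, n_b = len(group_a), len(group_b)
    _, pooled_var, df = ordinary_t(group_a, group_b)
    shrunk_var = (prior_df * prior_var + df * pooled_var) / (prior_df + df)
    diff = mean(group_a) - mean(group_b)
    return diff / sqrt(shrunk_var * (1 / n_a + 1 / n_b))


# Made-up log2 expression values for one gene: 3 KO vs 3 control samples.
ko = [8.10, 8.12, 8.11]
control = [7.60, 7.61, 7.59]

t_ordinary, _, _ = ordinary_t(ko, control)
t_moderated = moderated_t(ko, control, prior_var=0.05, prior_df=4.0)
print(t_ordinary, t_moderated)
```

With only three samples per group, this gene's tiny sample variance makes the ordinary t-statistic very large, while the moderated version is much smaller. That is one reason the two programs can disagree so much on the same numbers.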

And you need to use the adjusted p-value, which corrects for multiple comparisons: since microarrays contain thousands of probes, you need the adjusted p-values to determine which ones show statistically significant differences in a differential gene expression analysis. If your goal was just to look at the expression of a single gene, then there is no need for multiple comparisons correction (but if you're looking at just one gene, you wouldn't be using microarrays anyway).
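For reference, the adjusted p-values that limma's topTable reports by default use the Benjamini-Hochberg (BH) method. Below is a small plain-Python sketch of that adjustment, just to show what happens to the raw p-values. The function name and the four example p-values are made up.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four probes.
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))
```

Note that each adjusted value depends on how many tests were run and on where that p-value ranks among them, so the same raw p-value gets a different adjusted value in a different dataset.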

For making graphs, I'd recommend plotting the log fold change for each gene rather than two bars (the average expression of the treatment samples for gene X and the average expression of the control samples for gene X). In either case, use the limma adjusted p-values.

written 4 weeks ago by dsull420
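As a quick illustration of what gets plotted: RMA values are already on the log2 scale, so for a simple two-group comparison the log fold change is the difference of the group means. A minimal sketch with made-up values (the function name is just for illustration):

```python
from statistics import mean


def log2_fold_change(treatment, control):
    """Difference of group means; inputs must already be log2 values."""
    return mean(treatment) - mean(control)


# Made-up RMA (log2) expression values for one gene.
ko = [8.10, 8.12, 8.11]
control = [7.60, 7.61, 7.59]

lfc = log2_fold_change(ko, control)
print(lfc, 2 ** lfc)  # log2 fold change, and the same change on the linear scale
```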

Thank you very much, dsull! It was a great explanation.

I want to plot a single gene in a bar plot using microarray data because my resources here are scarce, so doing the perturbations and qPCR myself is not possible right now. I want to use deposited data to see whether the expression of some targets of interest is significantly regulated by the gene KO. So, for just a single gene, should I report the limma p-value or the adjusted p-value?

written 4 weeks ago by LuisNagano20

If you're testing a single gene and have always only cared about that one gene, you use the p-value. If you're testing multiple genes, you adjust your p-values. This is generally true regardless of the techniques used.
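A bit of arithmetic shows why the number of tests matters. The sketch below assumes that every null hypothesis is true and that the tests are independent, which is a simplification for real expression data. The function names are made up.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of p < alpha calls when no gene truly changes."""
    return alpha * n_tests


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive, ASSUMING independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 10, 20000):
    print(n_tests,
          expected_false_positives(0.05, n_tests),
          familywise_error_rate(0.05, n_tests))
```

With one pre-specified gene, the chance of a false positive at p < 0.05 is just 5%. With around 20,000 probes you would expect on the order of a thousand false positives from unadjusted p-values, which is what the adjustment is there to control.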

However, keep in mind that using a microarray to analyze a single gene isn't ideal because that's not what microarrays were designed for (they were designed to look at thousands of genes). There are a lot of concepts involved here (noise, dynamic range, etc.), but essentially it boils down to this: microarrays tell you which genes (out of tens of thousands) are likely to be differentially expressed, whereas qPCR is much better at telling you whether a single gene of interest is likely to be differentially expressed.

You're trying to validate a gene of interest using an assay that is fairly crappy for differential gene expression of individual genes. Not sure how much info you can gain from it. What will a p-value (calculated based on assumptions that really aren't exactly true) in this case really tell you? That said, if you have no other options and if you really believe that your gene of interest is differentially expressed and that despite all the limitations of microarray, it has enough power to detect the effect if it exists, then go for it.

For me, I usually end up caring about more genes than just a single gene of interest when I have genome-wide data available (so in practice I almost always use FDR control). After all, do I 'really' just want to care about one gene? What if a large number of genes are differentially expressed (most of them more so than my gene of interest)? Is my gene of interest really 'special' then? I guess it depends on what your question is, but it's something to think about.

written 4 weeks ago by dsull420