Question

p-value in Limma vs Graphpad

1

4.5 years ago

LuisNagano ▴ 90

Hello guys,

I have a statistical question. When I analyze my RMA-normalized microarray data with limma, I get a p-value and an adjusted p-value from the differential expression analysis. But when I run the same normalized data through GraphPad, the result is a very different p-value. I understand that limma will necessarily give different values from GraphPad, but for a bar graph comparing two columns, can I use the limma p-value to mark statistical significance, or do I need to use the adjusted p-value to say that there is a significant difference between my samples?

Thank you!

p-value adj p limma graphpad • 1.8k views

ADD COMMENT • link updated 4.5 years ago by dsull ★ 5.8k • written 4.5 years ago by LuisNagano ▴ 90

0

I suggest you read the limma papers to understand what it is doing (model fitting, etc.). In short: it is not doing a simple t-test, as dsull already pointed out. Run the entire dataset through limma and take the adjusted p-values. Do not start custom approaches outside of the established packages for expression analysis if you have no expert knowledge. The introduction of the limma user's guide cites the relevant literature: https://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf

ADD REPLY • link 4.5 years ago by ATpoint 82k

score 1 · Answer 1 · 2019-10-16

1

4.5 years ago

dsull ★ 5.8k

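Yes, you will get different p-values: limma does not run an ordinary Student's t-test. It fits a linear model to each probe and computes a moderated t-statistic, in which each probe's variance estimate is shrunk toward a value estimated from all probes on the array (an empirical Bayes approach).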
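To make the difference concrete, here is a minimal sketch of an ordinary versus a moderated t-statistic for one probe in a two-group comparison. It is not limma's code: the function name is made up, and `prior_df` and `prior_var` are hypothetical inputs here, whereas limma estimates the corresponding quantities from all probes. The moderated statistic is also compared against a t-distribution with `prior_df + resid_df` degrees of freedom rather than `resid_df`, which is another reason the p-values differ.

```python
import math
from statistics import mean, variance

def ordinary_and_moderated_t(group_a, group_b, prior_df, prior_var):
    """Return (ordinary t, moderated t) for one probe measured in two groups.

    prior_df and prior_var are assumed to be given; they stand in for the
    values an empirical Bayes step would estimate from the whole array.
    """
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)  # difference of log2 values
    resid_df = n_a + n_b - 2
    # Pooled within-group variance, as in an equal-variance Student's t-test.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / resid_df
    # Shrink this probe's variance toward the prior variance.
    post_var = (prior_df * prior_var + resid_df * pooled_var) / (prior_df + resid_df)
    scale = math.sqrt(1 / n_a + 1 / n_b)
    t_ordinary = diff / (math.sqrt(pooled_var) * scale)
    t_moderated = diff / (math.sqrt(post_var) * scale)
    return t_ordinary, t_moderated

# A probe with an unusually small sample variance: the ordinary t is inflated,
# the moderated t is pulled back toward what the rest of the array suggests.
t_ord, t_mod = ordinary_and_moderated_t(
    [8.10, 8.20, 8.15], [7.60, 7.70, 7.65], prior_df=4, prior_var=0.04
)
print(round(t_ord, 2), round(t_mod, 2))
```

With only three samples per group, a per-probe variance estimate is very noisy, which is why borrowing information across probes changes the result so much.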

And you need to use the adjusted p-value, which corrects for multiple comparisons: since a microarray contains thousands of probes, deciding which ones show statistically significant differences in a differential expression analysis requires the adjusted p-values. If your goal were just to look at the expression of a single gene, there would be no need for multiple-comparisons correction (but if you're looking at just one gene, you wouldn't be using microarrays anyway).
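For intuition about what an adjusted p-value is, here is a small sketch of the Benjamini-Hochberg adjustment, which, as far as I know, is the default behind limma's adj.P.Val column. The function name `bh_adjust` and the example p-values are made up for illustration; in practice you would take the adjusted values straight from limma rather than recomputing them.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Four hypothetical probes; every adjusted value is >= its raw p-value.
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))
```

The adjustment depends on how many tests were run, so the same raw p-value becomes less impressive on an array with tens of thousands of probes than in a panel of four.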

For making graphs, I'd recommend just plotting the log fold change for each gene rather than plotting two bars (average expression of treatment samples for gene X next to average expression of control samples for gene X). But in either case, use the limma adjusted p-values.
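As a rough illustration of the quantity to plot: RMA-normalized values are on a log2 scale, so the log fold change for a gene is simply the difference between the group means. The function name and the expression values below are invented for the example.

```python
from statistics import mean

def log2_fold_change(treatment, control):
    """Difference of group means; inputs are assumed to be log2-scale values."""
    return mean(treatment) - mean(control)

# Hypothetical RMA values for one gene in three KO and three control samples.
lfc = log2_fold_change([9.2, 9.0, 9.4], [8.1, 8.3, 8.2])
fold = 2 ** lfc  # back-transform to a linear-scale fold change
print(round(lfc, 2), round(fold, 2))
```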

ADD COMMENT • link 4.5 years ago by dsull ★ 5.8k

0

Thank you very much, dsull! It was a great explanation.

I want to plot a single gene in a bar plot using microarray data because my resources here are scarce, so doing the perturbations and qPCR myself is not possible right now. I want to use deposited data to see whether the expression of some targets of interest is significantly regulated by the gene KO. In that case, for just a single gene, should I use the limma p-value or the adjusted p-value?

ADD REPLY • link 4.5 years ago by LuisNagano ▴ 90

1

If you're testing a single gene and have always only cared about that one gene, you use the p-value. If you're testing multiple genes, you adjust your p-values. This is generally true regardless of the techniques used.

However, keep in mind that using a microarray to analyze a single gene isn't ideal, because that's not what microarrays were designed for (they were designed to look at thousands of genes). There are a lot of concepts involved here (noise, dynamic range, etc.), but essentially it boils down to this: microarrays tell you which genes (out of tens of thousands) are likely to be differentially expressed, whereas qPCR is much better at telling you whether a single gene of interest is likely to be differentially expressed.

You're trying to validate a gene of interest using an assay that is fairly crappy for differential gene expression of individual genes. Not sure how much info you can gain from it. What will a p-value (calculated based on assumptions that really aren't exactly true) in this case really tell you? That said, if you have no other options and if you really believe that your gene of interest is differentially expressed and that despite all the limitations of microarray, it has enough power to detect the effect if it exists, then go for it.

For me, I usually end up caring about more genes than just a single gene of interest when I have genome-wide data available (so I almost always use FDR control). After all, do I 'really' just want to care about one gene? What if a large number of genes are differentially expressed (most more so than my gene of interest)? Is my gene of interest really 'special' then? I guess it depends on what your question is, but it's something to think about.

ADD REPLY • link 4.5 years ago by dsull ★ 5.8k