Question: Volcano plot: why is there big FC with big p-values?
5 weeks ago, i.am.filippov wrote:

I'm looking at a tutorial about analysing differential expression from microarray data. limma is used to detect differentially expressed genes.

Now if you look at:

here

the bigger log fold change corresponds to a smaller p-value, i.e. a bigger FC is more significant. But why would different genes at the same FC level have different p-values? What explains the big spread? Does this question make sense?

Thanks!

modified 5 weeks ago by Timze W • written 5 weeks ago by i.am.filippov

Two simple explanations are larger within-treatment variances (e.g. counts for four treatment 1 samples are 2,2,2,2 and counts for four treatment 2 samples are 8,0,0,0), or differences in the magnitude of the counts behind the same ratio (e.g. 1 vs 2 or 100 vs 200).

comment written 5 weeks ago by h.mon
5 weeks ago, ATpoint (Germany) wrote:

The smaller the counts of a gene (or whatever you measure), the less reliable they are and the more prone they are to showing large fold changes.

Let's take an example:

A gene had 10 counts in sampleA and 2 counts in sampleB. That makes a fold change of 5, right? Say another gene had 1000 counts in A and 200 in B, also FC = 5. Which is more reliable? I would say the second one. Imagine you have small fluctuations in the counts because of the inherent uncertainty / error rate of sequencing and the quantification method. Say the first gene now had only 5 counts in A and 4 in B: the FC is now 1.25 instead of 5. If the second gene had the same fluctuation, so 995 in A and 202 in B, the FC is now about 4.93, still very close to 5. High counts are more resistant to small fluctuations. => If the mean (the average counts for the gene) is low, large fold changes arise easily but are unreliable. As far as I know this holds true for every kind of experiment in which quantities are measured.
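The arithmetic above can be checked with a few lines of Python. This is only a sketch of the worked example: the gene labels and the `fold_change` helper are made up for illustration, and the "fluctuation" is the same fixed shift of -5 counts in A and +2 counts in B used in the text, not a real noise model.

```python
def fold_change(a, b):
    """Ratio of counts in sample A to counts in sample B."""
    return a / b

# Hypothetical genes from the example: (counts in A, counts in B)
genes = {"low_count_gene": (10, 2), "high_count_gene": (1000, 200)}

# The same absolute fluctuation applied to both genes (assumption from the text)
shift_a, shift_b = -5, 2

for name, (a, b) in genes.items():
    before = fold_change(a, b)
    after = fold_change(a + shift_a, b + shift_b)
    print(f"{name}: FC before = {before:.2f}, FC after = {after:.2f}")
```

The low-count gene drops from FC 5 to 1.25, while the high-count gene barely moves (about 4.93).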

Long story short: low counts tend to show artificially high (and often false) fold changes, so the confidence in them is low and their p-values tend to be large. You would need more replicates to have the power to detect differentially expressed genes with low counts than genes with high counts. That is why statistical power is inherently greater for highly expressed than for lowly expressed genes.

modified 18 days ago • written 5 weeks ago by ATpoint
5 weeks ago, Timze W wrote:

It seems you have the idea that a bigger fold change should give a smaller p-value.
But the p-value and the fold change are not necessarily related: the fold change only reflects the change in the means, whereas the p-value depends not only on the means but also on the variance (for example, if you perform a two-sample Student's t-test).
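To make this concrete, here is a minimal sketch that computes a two-sample t statistic by hand for two hypothetical genes. The log2 expression values are invented for illustration, and the Welch (unequal-variance) form of the statistic is used as an assumption; limma uses a moderated t-statistic, which this does not reproduce. Both genes have exactly the same difference in means, yet very different t statistics, and a smaller |t| means a larger p-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch's two-sample t statistic: difference in means over its standard error."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se

# Hypothetical log2 expression values, four replicates per group.
# Both genes have a mean difference of 2 (a 4-fold change).
tight_a, tight_b = [8.9, 9.0, 9.1, 9.0], [6.9, 7.0, 7.1, 7.0]   # low variance
noisy_a, noisy_b = [7.0, 11.0, 8.0, 10.0], [5.0, 9.0, 6.0, 8.0]  # high variance

for label, (a, b) in {"tight": (tight_a, tight_b), "noisy": (noisy_a, noisy_b)}.items():
    print(f"{label}: log2FC = {mean(a) - mean(b):.2f}, t = {welch_t(a, b):.2f}")
```

With these made-up numbers the low-variance gene gets t of roughly 34.6 and the high-variance gene roughly 1.55, so on a volcano plot they would sit at the same x position but at very different heights.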

written 5 weeks ago by Timze W