Question: The same q-value is reported for different p-values
2.1 years ago by
carloscardoso.bio0 wrote:

Hi guys,

I'm running methylKit to find differentially methylated CpGs. I have three replicates in the control group and three in the treatment group. When I run myDiff=calculateDiffMeth(meth,mc.cores=4,method="fdr") and then hyper=getMethylDiff(myDiff,difference=10,qvalue=0.01,type="hyper"), the output gives me qvalues lower than pvalues. Some of these pvalues are not even statistically significant. The result is the same when I change the adjustment method (SLIM/BH).

> hyper
methylDiff object with 12912 rows
   chr  start    end strand     pvalue      qvalue meth.diff
chr1.1 226898 226898      + 0.02917285 0.001714451  20.02165
chr1.1 227164 227164      + 0.43722433 0.001714451  11.76471
chr1.1 227258 227258      + 0.07721846 0.001714451  30.04769
chr1.1 273666 273666      + 0.08245243 0.001714451  15.38462
chr1.1 293303 293303      + 0.04406780 0.001714451  15.38462
chr1.1 297572 297572      + 0.33056133 0.001714451  11.02757

How is it possible to have qvalues higher than pvalues with so many multiple comparisons (~8 million)? What should I do to overcome this problem and get reliable results (p-value lower than 0.05 and q-value lower than 0.01)?

next-gen R • 564 views
modified 2.1 years ago by Istvan Albert ♦♦ 86k • written 2.1 years ago by carloscardoso.bio0

You could try the standard p.adjust with method = "BH" (the Benjamini-Hochberg FDR correction) on the pvalue column of hyper.

It may be that the parallel computing messed up the qvalues.
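
As a rough illustration of the p.adjust suggestion, here is a minimal sketch in base R. The p-values are made up for the example and are not the poster's data; on a real run the vector would presumably be the pvalue column of the complete result rather than of a filtered subset, since the adjustment depends on the total number of tests.

```r
# Toy p-values, invented for illustration only
p <- c(0.0001, 0.004, 0.03, 0.08, 0.33, 0.44, 0.9)

# Benjamini-Hochberg adjustment from the stats package (part of base R)
q <- p.adjust(p, method = "BH")

# A BH-adjusted value is never smaller than the raw p-value it came from
stopifnot(all(q >= p))

print(data.frame(pvalue = p, qvalue = q))
```

If BH-adjusted values come out below the raw p-values, as in the table in the question, that would suggest the reported qvalue column is not a plain BH adjustment of the reported pvalue column.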

modified 2.1 years ago • written 2.1 years ago by michael.ante 3.6k
2.1 years ago by
Istvan Albert ♦♦ 86k (University Park, USA) wrote:

To be honest, your question is confusing. You mention

...the output give me qvalues lower than pvalues ...

but then

... how is it possible to have qvalues higher than pvalues ...

Is your question whether the q-values are higher or lower? Unclear...

That being said, there is no reason to select both by p-value and q-value at the same time. You would be messing up the statistical interpretation of the results.

Then, as it happens, q-values are less well defined than p-values: different tools may compute different quantities that they call q-values, so look into the documentation. In the example that you show, the q-value is larger than 1, which also goes against what a typical q-value should be, namely in the range [0,1].

Finally, I will say that q-values are not p-values. While one might expect q-values to be lower than p-values, and that is how they turn out most of the time, they are different concepts altogether.


Hi Istvan

Thank you for your answer, and sorry for my ambiguous question. The q-values are lower than the p-values. The q-values are in the column where every value equals 0.001714451. The values above 1 are from the meth.diff column.

The reason for using both p-values and q-values is that the q-values were not computed correctly. I checked the histogram of p-values, and most of the p-values are equal or close to one. Maybe this lack of uniformity is affecting the FDR correction. Would you have any suggestion for correcting p-values for multiple comparisons?

written 2.1 years ago by carloscardoso.bio0

Frankly, if the q-values are computed incorrectly, the entire pipeline is suspect in my opinion. Are the p-values to be trusted then? Hard to say.

You can approximate the adjusted p-value with a Bonferroni correction, where you simply divide the threshold by the number of comparisons that you make.

So if you initially wanted to apply a 0.05 threshold and you had 10 comparisons (rows in the table), the threshold becomes 0.05/10 = 0.005.
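
The arithmetic above can be sketched in base R as follows. The ten p-values are invented for illustration, and the variable names are arbitrary; the only library call is p.adjust from the stats package.

```r
# Ten toy p-values, invented for illustration only
p <- c(0.0004, 0.003, 0.006, 0.02, 0.04, 0.11, 0.25, 0.5, 0.7, 0.95)

alpha <- 0.05              # threshold you would use for a single test
n_tests <- length(p)       # number of comparisons (rows in the table)

# Bonferroni: divide the threshold by the number of comparisons
threshold <- alpha / n_tests
significant <- p < threshold

# Equivalent view: scale the p-values up and keep the original threshold
p_bonf <- p.adjust(p, method = "bonferroni")
stopifnot(identical(significant, p_bonf < alpha))

print(data.frame(pvalue = p, bonferroni = p_bonf, significant = significant))
```

With the roughly 8 million comparisons mentioned in the question, the same division would give a per-test threshold of 0.05 / 8e6, which is a very strict cutoff.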

written 2.1 years ago by Istvan Albert ♦♦ 86k