qvalue problem in mutation dataset
8.7 years ago
PH_2015 • 0

Hi all,

I am a new member of this community and have a question about q-value calculation. I have a mutation dataset, and I used it to look for functionally significant mutations in genes (CDS). For each gene I have a p-value from a Wilcoxon rank-sum test. Some genes have p <= 0.05; the lowest in my dataset is 0.01269. Whenever I apply multiple hypothesis correction with the BH (Benjamini-Hochberg) method, I get very high q-values, around 0.4888820. As I understand it, this means that nearly 49% of the tests called significant at that threshold are expected to be false positives, which is a huge number. The genes that come up as significant before correction include known cancer genes and significantly mutated genes. Can you suggest how to deal with this problem? As it stands, the correction simply says that my results are not statistically significant.

Many thanks,
PH

SNP • 1.8k views

There is no way to deal with that. After you correct for multiple testing, your p-values are not significant. However, you might be able to validate the results by showing that the genes with uncorrected p-value <= 0.05 have some functional relevance. Can you validate some of them? That would be the best option.


The idea is to find some driver mutations from the analysis and validate them. Another check on the test is whether known cancer genes show up, which is what I am finding in my results. It's the q-value/adjusted p-value that is coming out so high.

8.7 years ago
Steven Lakin ★ 1.8k

Statistical testing, significance, and multiple testing correction have no bearing on whether or not what you are testing has an actual biological impact. Statistics is meant to provide mathematical support for your results, but a significant test statistic or error rate does not definitively identify a result as a true positive or a true negative. So even if you recalculate your values using a different statistic so that they are significant, it won't mean anything until you validate them anyway.

It may be that your experimental design wasn't robust enough to get significant results post-correction, possibly because many multiple-test correction measures are very conservative. These corrections are helpful if you have many significant results in your uncorrected data that you need to narrow down in order to validate with further experiments.

However, in your case, since your lowest p-value before correction is already fairly high, you could either (1) take the most significant of your initial results and run further experiments to validate them or retest them in some way, or (2) redesign your experiment to give higher power for detecting significant results.

See this article for a relevant review of multiple testing measures.

It sounds like you're going to do further validation regardless, and if that's the case and you don't want to redesign your initial experiment, you could just take your most significant results and see what results are generated by the validation. However, if you don't plan on doing any further testing of significant results, then the data you currently have can't really tell you much.
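To see why a smallest p-value around 0.01 can still end up with a q-value near 0.5, here is a minimal sketch of the BH adjustment in plain Python. The function name `bh_adjust` and the 40 p-values are made up for illustration; only the smallest value, 0.01269, comes from the question, and the real number of genes tested isn't given in the thread.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases (monotonicity step).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 40 genes, only three with p <= 0.05.
pvals = [0.01269, 0.03, 0.045] + [0.2 + 0.02 * k for k in range(37)]
qvals = bh_adjust(pvals)
print(min(qvals))
```

The adjusted value for the best gene is roughly its p-value times the number of tests divided by its rank, so with only a handful of small p-values among many tests, nothing survives the correction. Testing fewer, pre-selected genes or increasing power are the only things that change that arithmetic.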
