Question: How to Calculate FDR in permutation F test
Asked 3.8 years ago by hellocita:

Hi all, I am a little confused about how to calculate the FDR after a permutation F test.

Assume there are 6000 genes in my data. For each gene, I run a permutation F test and get 1000 F values: 1 original F value and 999 F values from permutations. The p-value is then sum(F > F-original)/1000.

But I am confused about how to calculate the FDR. I think it should be FDR = (number of false-positive genes) / (number of genes with permutation p < 0.05).

Thank you in advance:)

Tags: R

Comment by Denis, 19 months ago:

Hi! Did you find answers to the questions you asked? To my understanding, for each gene you have to calculate perm_p-value = (number of permutation p-values <= experimental p-value + 1) / (total number of permutations + 1), so your formula is not correct as written. To perform FDR correction, take your raw p-values and adjust them, e.g. with the base R function p.adjust(method='fdr').
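As a rough illustration of the two steps in the comment above, here is a minimal Python sketch using only the standard library. It assumes the permutation p-value is computed from the F statistics themselves (counting permuted F values at least as large as the observed one, which is equivalent to counting permuted p-values no larger than the observed p-value). The function names `permutation_p_value` and `bh_adjust` are made up for this example. The second function is a hand-written Benjamini-Hochberg adjustment, which is the method R's p.adjust applies for method='fdr'.

```python
def permutation_p_value(observed, permuted):
    """Return (b + 1) / (n + 1), where b counts permuted statistics >= observed
    and n is the number of permutations (the observed value is not in `permuted`)."""
    b = sum(1 for f in permuted if f >= observed)
    return (b + 1) / (len(permuted) + 1)


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With 999 permuted F values per gene, `permutation_p_value` gives p-values on a grid of 1/1000, and the smallest possible value is 0.001 rather than 0. The list of 6000 per-gene p-values would then be passed to `bh_adjust` in one call.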
Answer by Jean-Karim Heriche (EMBL Heidelberg, Germany), 3.8 years ago:

The FDR is the expected proportion of false positives among the tests called significant at a given p-value threshold, roughly E[false positives]/E[significant tests]. E[significant tests] is estimated by the number of tests called significant at the chosen threshold. The problem is then to estimate the number of false positives. This is the number of true negatives (tests for which the null hypothesis holds) times the probability of calling one of them significant, which is the chosen threshold. So we need to estimate the number of true negatives.

For this we can assume that the distribution of p-values for true negatives is uniform, plot a histogram of the observed p-values and find the region where the distribution is flat. The height of this part gives an estimate of the proportion of true negatives. In practice, one finds a value lambda above which the p-value distribution is flat; the estimated proportion of true negatives is then (number of p-values greater than lambda) / ((1 - lambda) * total number of tests). See Storey, J. D. and R. Tibshirani (2003). “Statistical significance for genome-wide studies.” Proceedings of the National Academy of Sciences 100(16): 9440-9445.

This is related to the q-value, which is the minimum FDR at which a particular test can be called significant. This is probably what you want and is available as the qvalue() function in the qvalue R package.
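The estimate described in this answer can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: lambda is fixed at 0.5 rather than chosen from the histogram, and the helper names `estimate_pi0` and `estimate_fdr` are invented here. The qvalue package selects lambda more carefully and should be preferred for real analyses.

```python
def estimate_pi0(p_values, lam=0.5):
    """Estimated proportion of true negatives:
    (number of p-values > lam) / ((1 - lam) * number of tests), capped at 1."""
    m = len(p_values)
    above = sum(1 for p in p_values if p > lam)
    return min(1.0, above / ((1.0 - lam) * m))


def estimate_fdr(p_values, threshold, lam=0.5):
    """Estimated FDR when calling every test with p <= threshold significant."""
    m = len(p_values)
    significant = sum(1 for p in p_values if p <= threshold)
    if significant == 0:
        return 0.0  # nothing is called significant, so no false discoveries
    # Expected false positives: true negatives (pi0 * m) times the threshold.
    expected_false_positives = estimate_pi0(p_values, lam) * m * threshold
    return min(1.0, expected_false_positives / significant)
```

For the question's setting, `p_values` would hold the 6000 per-gene permutation p-values and `threshold` would be 0.05.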
