Question: How to Calculate FDR in permutation F test
hellocita20 wrote:

Hi all, I am a little confused about how to calculate the FDR after a permutation F test.

Assume there are 6000 genes in my data. For each gene, I compute 1000 F values: 1 original F value and 999 F values from permuted data. The p-value is then sum(F > F-original)/1000.

But I am confused about how to calculate the FDR. I think it should be FDR = (number of false positive genes) / (number of genes with permutation p < 0.05).

R • 3.2k views
modified 3.8 years ago by Jean-Karim Heriche24k • written 3.8 years ago by hellocita20

Hi! Did you find answers to the questions you asked? To my understanding, for each gene you have to calculate: perm_p-value = (number of permutation p-values <= experimental p-value + 1) / (total number of permutations + 1). Counting permuted F values >= the original F value gives the same result. So your formula is not quite correct as written. To perform the `FDR` correction, take your raw `p-values` and adjust them, e.g. with the base `R` function `p.adjust(method='fdr')`.
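
A minimal sketch of the calculation described in this comment, in base R. The names `f_obs` (one original F value per gene) and `f_perm` (genes in rows, permutations in columns) are hypothetical, and the F values are simulated only so the example runs on its own; in a real analysis they would come from the F test on the original and the permuted data.

```r
set.seed(1)
n_genes <- 6000
n_perm  <- 999

# Simulated stand-ins (assumption, not real data)
f_perm <- matrix(rf(n_genes * n_perm, df1 = 2, df2 = 12), nrow = n_genes)
f_obs  <- rf(n_genes, df1 = 2, df2 = 12)
f_obs[1:300] <- f_obs[1:300] + 15   # pretend 300 genes carry a real effect

# Permutation p-value with the +1 correction:
# (number of permuted F >= original F, plus 1) / (number of permutations + 1)
# f_obs is recycled down the columns, so row i is compared with f_obs[i]
perm_p <- (rowSums(f_perm >= f_obs) + 1) / (n_perm + 1)

# Benjamini-Hochberg adjustment; method = "fdr" is an alias for "BH"
p_bh <- p.adjust(perm_p, method = "fdr")

# Number of genes called significant at an FDR of 5%
sum(p_bh < 0.05)
```

With 999 permutations the smallest possible p-value is 1/1000, so with 6000 genes more permutations may be needed before any gene can pass a strict FDR cutoff.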

Jean-Karim Heriche24k wrote:

The FDR is the expected proportion of false positives among the tests called significant at a given p-value threshold. It can be approximated as E[false positives]/E[significant tests]. E[significant tests] is estimated by the number of tests called significant at the chosen threshold. The problem is then to estimate the number of false positives. This is the number of true negatives (tests for which the null hypothesis holds) times the probability of calling one of them significant, which is the chosen threshold. So we need to estimate the number of true negatives. For this we can assume that the distribution of p-values for true negatives is uniform, plot a histogram of the observed p-values and find the region where the distribution is flat. The height of this part gives an estimate of the proportion of true negatives. In practice, one finds a value lambda above which the p-value distribution is flat, and the proportion of true negatives is estimated as (number of p-values greater than lambda) / ((1 - lambda) × total number of tests). See Storey, J. D. and R. Tibshirani (2003). “Statistical significance for genome-wide studies.” Proceedings of the National Academy of Sciences 100(16): 9440-9445.
This is related to the q-value, which is the minimum FDR at which a particular test can be called significant. This is probably what you want, and it is available as the qvalue() function in the qvalue R package.
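
To make the estimate above concrete, here is a rough base R sketch under stated assumptions: the p-values are simulated (90% true negatives), `lambda` and `threshold` are hand-picked illustrative values, and the q-values are computed directly from the definition rather than with the qvalue package, which chooses lambda in a more refined way.

```r
set.seed(1)
m <- 6000

# Simulated p-values (assumption): 5400 true negatives (uniform) and
# 600 true positives (concentrated near zero)
p <- c(runif(5400), rbeta(600, shape1 = 0.2, shape2 = 5))

lambda <- 0.5   # hand-picked; in practice read it off hist(p)

# Proportion of true negatives: #{p > lambda} / ((1 - lambda) * m)
pi0 <- min(1, sum(p > lambda) / ((1 - lambda) * m))

# Estimated FDR when calling p <= threshold significant:
# expected false positives (pi0 * m * threshold) / number called significant
threshold <- 0.01
n_sig   <- sum(p <= threshold)
fdr_hat <- pi0 * m * threshold / max(n_sig, 1)

# q-value of each test: the smallest estimated FDR over all thresholds
# at or above its own p-value
ord      <- order(p)
fdr_at_p <- pi0 * m * p[ord] / seq_len(m)
q        <- numeric(m)
q[ord]   <- pmin(1, rev(cummin(rev(fdr_at_p))))
```

Calling `sum(q <= 0.05)` then gives the number of tests that can be called significant while keeping the estimated FDR at or below 5%.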