Hi all. I am trying to replicate the results of another paper.
The aim: I have a P value. Then I randomised the data set 1,000 times, and now I have another set of randomised P Values. As an example, let's say I have the p value below, and I've randomised the data set 10 times and I have 10 randomised p values:
Actual P Value = 0.03 Random P Values: 0.06 0.21 0.003 0.21 0.017 0.21 0.48 0.21 0.48 0.21
The exact technique that I want to do is: "Calculate the false discovery rate (FDR) (Q), defined as the number of expected false positives over the number of significant results (Storey and Tibshirani, 2003)".
I wrote this code using python, rpy2 and R:
import sys import rpy2 import rpy2.robjects as ro from rpy2.robjects import r from rpy2.robjects.packages import importr devtools = importr("devtools") qvalue = importr("qvalue") list1 = [float(line.strip()) for line in open(sys.argv)] #This is a list of 1,000 floats (randomised P Values) pvals = ro.FloatVector(list1) rcode = 'qobj <-qvalue(p=%s)' %(pvals.r_repr()) res = ro.r(rcode) r_output1 = 'qobj$pvalue' r_output2 = 'qobj$qvalue' r_pvalue = ro.r(r_output1) r_qvalue = ro.r(r_output2) ListOfValues = zip(r_pvalue,r_qvalue) for i in ListOfValues: print str(i) + "\t" + str(i)
The output looks like this (a snapshot of my output for 1,000 p values):
P value Q value 0.21 0.032 0.48 0.04 0.21 0.032 0.81 0.068 0.017 0.017 0.48 0.047 0.81 0.06 0.017 0.017 0.211 0.03 0.81 0.068 0.069 0.023 0.069 0.023 0.069 0.023 0.48 0.047
I do not want an individual Q value for every randomised P value. I have an original P value (in this case 0.03), and I want one Q value to be associated with this P value; based on the set of randomised P Values. Could someone tell me where I've gone wrong in this code; or how I extract this information?