p value for permutation test for gene-interaction data
4.1 years ago
pixie@bioinfo ★ 1.5k

Hello, I have data on pairwise interactions between genes. I took the top 1000 interactions (ranked by some value) and found X interactions in which Gene_1 is hypomethylated and Gene_2 is up-regulated.

My objective is to show that X, the number of such interactions in the ordered set, is significantly higher than in a random set of interactions. For this, I randomized the interactions 1000 times and each time counted the interactions in which Gene_1 is hypomethylated and Gene_2 is up-regulated. How do I get a p-value from these permutations? Kindly help. Thanks.

statistics • 1.2k views
4.1 years ago

Let's say X is the observed number of interactions in which Gene_1 is hypomethylated and Gene_2 is up-regulated. After your permutations (N = 1000) you will have 1000 simulated values, X_sim, each being the same count computed on one randomized set.

To compute the p-value, count how many values of X_sim are greater than or equal to X and divide by the number of permutations (N = 1000).

In R, for the example X = 20:

set.seed(123) # for reproducibility
X <- 20
# stand-in for the permutation results: draw X_sim from a Gaussian (mean = 10, sd = 5)
N <- 1000 # number of permutations
X_sim <- rnorm(n = N, mean = 10, sd = 5)
# plot the X_sim distribution and mark the observed X on the plot
plot(density(X_sim))
abline(v = X, col = "red")
# compute p-value
p <- sum(X_sim >= X) / N
# p = 0.028


Thus the further X lies in the upper tail of the X_sim distribution, the smaller (more "significant") the p-value will be.
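With real data, X_sim would come from the randomization itself rather than from rnorm. The following is only a minimal sketch of one possible null: it assumes a hypothetical data frame `interactions` with one row per candidate interaction and logical columns `gene1_hypo` and `gene2_up`, and it draws random sets of 1000 rows from the full pool. The column names, the toy data, and treating the first 1000 rows as the "top" set are all assumptions for illustration, not details from the question.

```r
set.seed(123)  # for reproducibility

# Toy stand-in for the full interaction table (hypothetical data)
n_pairs <- 20000
interactions <- data.frame(
  gene1_hypo = runif(n_pairs) < 0.2,  # TRUE if Gene_1 is hypomethylated
  gene2_up   = runif(n_pairs) < 0.3   # TRUE if Gene_2 is up-regulated
)

# An interaction is a "hit" when both conditions hold
hit <- interactions$gene1_hypo & interactions$gene2_up

top_n <- 1000  # size of the ordered set
N     <- 1000  # number of random sets

# Assumption: the table is already sorted, so the top set is the first top_n rows
X <- sum(hit[seq_len(top_n)])

# Count hits in N random sets of top_n interactions, each drawn without replacement
X_sim <- replicate(N, sum(hit[sample.int(n_pairs, top_n)]))

# Same p-value computation as above
p <- sum(X_sim >= X) / N
```

Whether drawing random sets of interactions is the right null depends on the data; shuffling the gene labels instead would test a different hypothesis.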


That's great, thank you very much!


Although it usually will not make much difference, there is an argument for estimating the permutation p-value by adding 1 to both the numerator and the denominator, so that p-val = (n + 1) / (m + 1), where n is the number of permuted values greater than or equal to your observed value and m is the number of permutations (see this reference).
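To make the difference concrete, here is a small sketch comparing the two estimators. The function name `perm_pvalue`, its `add_one` argument, and the toy counts are made up for illustration.

```r
# x_obs: observed count; x_sim: counts obtained from the permutations
perm_pvalue <- function(x_obs, x_sim, add_one = TRUE) {
  n <- sum(x_sim >= x_obs)  # permuted values at least as large as the observed one
  m <- length(x_sim)        # number of permutations
  if (add_one) (n + 1) / (m + 1) else n / m
}

x_sim <- c(8, 12, 9, 15, 11, 10, 7, 13, 9, 14)  # toy permutation counts

perm_pvalue(20, x_sim, add_one = FALSE)  # 0 / 10: no permuted value reaches 20
perm_pvalue(20, x_sim, add_one = TRUE)   # 1 / 11
perm_pvalue(13, x_sim, add_one = FALSE)  # 3 / 10: the values 15, 13 and 14 qualify
```

With the correction the estimate can never be exactly zero, and its smallest possible value is 1 / (m + 1). When n is far from zero the two estimates are nearly identical.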