hypergeometric test on outlier SNPs
0
1
Entering edit mode
5.8 years ago
Ana ▴ 200

Hi all, I have done some genome scan analyese with 2 different methods to identify outlier SNPs. There are some overlapping between these 2 methods. I want to know if the observed overlap between these 2 methods is any better than that obtained by chance alone? I have read different pots(https://stats.stackexchange.com/questions/16247/calculating-the-probability-of-gene-list-overlap-between-an-rna-seq-and-a-chip-c or https://www.biostars.org/p/90662/), but I am just getting a bit confused.

The total number of SNPs = 2,000,000,
total number of outlierSNPs discovered by method 1 =7889
total number of outlier SNPs discovered by method 2 =46340
overlapping between methods 1 and 2 outliers = 4567

I am using the "hyper" function in R, but I just do not understand how to specific hyper parameters

phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE)

first question, n is total number of SNPs - m or it should be total number of outlier SNPs outliers -m? how can I replace these parameters with actual values? Should it be like

phyper(4567-1, 46340,2,000,000-46340, 7889, lower.tail = TRUE, log.p = FALSE)

then I get 1, this means the overlapping observed is totally by chance! I would appreciate if anyone could help me to resolve my problem.

outliers hypergeometric test R • 1.4k views
ADD COMMENT

Login before adding your answer.

Traffic: 2118 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6