Question: hypergeometric test on outlier SNPs
12 months ago by Ana170

Hi all, I have done some genome scan analyses with 2 different methods to identify outlier SNPs. There is some overlap between the outliers found by these 2 methods. I want to know whether the observed overlap is any larger than what would be expected by chance alone. I have read different posts (https://stats.stackexchange.com/questions/16247/calculating-the-probability-of-gene-list-overlap-between-an-rna-seq-and-a-chip-c or https://www.biostars.org/p/90662/), but I am still a bit confused.

Total number of SNPs = 2,000,000
Total number of outlier SNPs discovered by method 1 = 7889
Total number of outlier SNPs discovered by method 2 = 46340
Overlap between method 1 and method 2 outliers = 4567
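
As a rough sanity check on these counts, the overlap expected if the two outlier sets were unrelated can be computed directly. The following is a minimal R sketch; the variable names are illustrative assumptions and do not come from either method's output.

```r
# Counts taken from the post; variable names are illustrative assumptions.
total_snps       <- 2000000
outliers_m1      <- 7889
outliers_m2      <- 46340
observed_overlap <- 4567

# Overlap expected if the two outlier sets were independent of each other
expected_overlap <- outliers_m1 * outliers_m2 / total_snps

# How many times larger the observed overlap is than the chance expectation
fold_enrichment <- observed_overlap / expected_overlap

print(expected_overlap)
print(fold_enrichment)
```

With these numbers the chance expectation is about 183 shared SNPs, so the observed 4567 is roughly 25 times larger.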

I am using the "phyper" function in R, but I do not understand how to specify its parameters:

phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE)

First question: is n the total number of SNPs minus m, or should it be the total number of outlier SNPs minus m? How do I replace these parameters with actual values? Should it be like this:

phyper(4567-1, 46340, 2000000-46340, 7889, lower.tail = TRUE, log.p = FALSE)

Then I get 1, which would mean the observed overlap is entirely due to chance! I would appreciate it if anyone could help me resolve this.
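
A note on the result of 1: with lower.tail = TRUE, phyper returns P(X <= q), which is close to 1 whenever the overlap is much larger than expected. For an enrichment question the quantity usually reported is the upper tail, P(X >= observed overlap). The sketch below assumes that all 2,000,000 SNPs were tested by both methods and treats the method 2 outliers as the "white balls" in phyper's terminology; the variable names are hypothetical.

```r
# Assumption: all 2,000,000 SNPs were tested by both methods.
total_snps <- 2000000
m <- 46340            # "white balls": outliers from method 2
n <- total_snps - m   # "black balls": SNPs that are not method 2 outliers
k <- 7889             # number drawn: outliers from method 1
overlap <- 4567       # outliers shared by both methods

# Lower tail, as in the call above: P(X <= overlap - 1)
p_lower <- phyper(overlap - 1, m, n, k, lower.tail = TRUE)

# Upper tail: P(X > overlap - 1), that is P(X >= overlap)
p_upper <- phyper(overlap - 1, m, n, k, lower.tail = FALSE)

# The same upper tail on the natural-log scale, useful when p_upper is tiny
log_p_upper <- phyper(overlap - 1, m, n, k, lower.tail = FALSE, log.p = TRUE)

print(p_lower)
print(p_upper)
print(log_p_upper)
```

Under these assumptions n is the total number of SNPs minus m. The upper-tail value is extremely small and may print as 0, which is why the log-scale version is included. Swapping the roles of the two methods (m = 7889, k = 46340, with n recomputed as total_snps - m) should give the same probability, since the hypergeometric distribution is symmetric in that respect.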
