Question: Allele frequency comparison by Fisher's exact test
5 months ago, seta (Sweden) wrote:

Dear friends,

I would like to compare the allele frequencies of a list of 3000 SNPs related to a given trait between my population and the 1000 Genomes populations (super populations). For my population, the SNPs were derived from whole genome sequencing of about 1000 individuals. However, due to missing genotypes and some quality control done at previous steps, the number of individuals (and so the total allele count) is not the same at all SNP positions, whereas, as I checked, the total allele count is the same at all positions for the 1000 Genomes populations. As I read, Fisher’s exact test can be used to compare allele counts between populations to find SNPs with significantly different allele frequencies, yes? I am also aware of the simple Fisher's test for one SNP in R, but I don’t know how to do it for a list of 3000 SNPs and get the adjusted p-value for each SNP. Could you please help me out with this issue?

P.S. As I’m basically a biologist, please kindly advise me on any points that should be considered when doing the analysis.

Many thanks

5 months ago, egeulgen (Istanbul) wrote:

You may perform multiple Fisher's exact tests, store the p-values, and then adjust them. In R, you would do something along the lines of:

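### Perform Fisher's exact test for each SNP and store the p-values
p_vals <- numeric(N_SNPS)  # preallocate instead of growing the vector in the loop
for (i in seq_len(N_SNPS)) {
    result <- fisher.test(...)  # table of allele counts for SNP i
    p_vals[i] <- result$p.value
}
### Adjust the p-values for multiple testing
p_adjusted <- p.adjust(p_vals, method = "YOUR_FAVORITE_METHOD")  # e.g. "BH" or "bonferroni"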

Hope this helps


Thank you for the response. Could you please tell me how I should run multiple (here 3000) Fisher's exact tests and store the related p-value for each SNP?

Reply written 5 months ago by seta

I provided the example R code in my response (see above). There, N_SNPS is replaced with 3000, and for fisher.test you have to do something like:

fisher.test(table(my_genotypes[i, ], kg_genotypes[i, ]))  # R names cannot start with a digit, so kg_genotypes rather than 1kg_genotypes

This does depend on how you store the two.

Reply written 5 months ago by egeulgen
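
If what you have stored is allele counts rather than per-individual genotypes, one possible way to build the input for a single SNP is a 2x2 matrix with one row per population and one column per allele. This is only a sketch: the variable names and the numbers below are made up for illustration and are not from the thread.

```r
# Hypothetical allele counts for one SNP (illustrative numbers only)
my_ref <- 1200  # reference allele count, own population
my_alt <- 700   # alternate allele count, own population
kg_ref <- 3000  # reference allele count, 1000 Genomes population
kg_alt <- 2008  # alternate allele count, 1000 Genomes population

# Rows are populations, columns are alleles
snp_table <- matrix(
  c(my_ref, my_alt,
    kg_ref, kg_alt),
  nrow = 2, byrow = TRUE,
  dimnames = list(population = c("mine", "1kg"),
                  allele     = c("ref", "alt"))
)

# Fisher's exact test on the 2x2 table of allele counts
res <- fisher.test(snp_table)
res$p.value
```

Because each test gets its own table, the total allele count does not have to be the same from one SNP to the next.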

Thanks for your help. I have both the ref and alt allele counts for each SNP from both populations; however, I'm not sure how to set them up and feed them to R for multiple Fisher's exact tests and get the p-value for each one. Could someone please help me more with this issue?

Reply written 5 months ago by seta
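
A minimal sketch of one way to run this over all SNPs, assuming the ref and alt counts for both populations sit in a single data frame with one row per SNP. The column names (snp, my_ref, my_alt, kg_ref, kg_alt), the helper fisher_p, and the three example rows are hypothetical, and "BH" is just one of the methods p.adjust accepts.

```r
# Hypothetical input: one row per SNP with ref/alt allele counts per population.
# In practice this would hold all 3000 SNPs; the numbers here are made up.
snp_counts <- data.frame(
  snp    = c("rs1", "rs2", "rs3"),
  my_ref = c(1500,  900, 1800),
  my_alt = c( 400, 1000,  150),
  kg_ref = c(4000, 2500, 4600),
  kg_alt = c(1008, 2508,  408)
)

# Hypothetical helper: Fisher's exact test p-value for one SNP
fisher_p <- function(my_ref, my_alt, kg_ref, kg_alt) {
  tab <- matrix(c(my_ref, my_alt, kg_ref, kg_alt), nrow = 2, byrow = TRUE)
  fisher.test(tab)$p.value
}

# Apply the helper row by row; mapply returns one p-value per SNP
snp_counts$p_value <- mapply(fisher_p,
                             snp_counts$my_ref, snp_counts$my_alt,
                             snp_counts$kg_ref, snp_counts$kg_alt)

# Adjust for multiple testing (Benjamini-Hochberg chosen here as an example)
snp_counts$p_adjusted <- p.adjust(snp_counts$p_value, method = "BH")

# SNPs sorted by adjusted p-value, most significant first
snp_counts[order(snp_counts$p_adjusted), ]
```

Keeping the p-values as columns next to the SNP identifiers means each raw and adjusted p-value stays attached to its SNP after sorting or filtering.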