Question: Allele frequency comparison by Fisher's exact test
10 months ago by
seta1.2k
Sweden
seta1.2k wrote:

Dear friends,

I would like to compare the allele frequencies of a list of 3000 SNPs related to a given trait between my population and the 1000 Genomes populations (super populations). For my population, the SNPs were derived from whole-genome sequencing of about 1000 individuals. Because of missing genotypes and quality control done at earlier steps, the number of individuals (and so the total allele count) is not the same at all SNP positions; as far as I checked, however, the total allele count is the same at all positions for the 1000 Genomes populations. As I have read, Fisher's exact test can be used to compare allele counts between populations to find SNPs with significantly different allele frequencies, is that right? I am also aware of the simple Fisher's test for one SNP in R, but I don't know how to do it for a list of 3000 SNPs and get the adjusted p-value for each SNP. Could you please help me with this issue?

P.S. As I'm basically a biologist, please kindly advise me on any points that should be considered when doing the analysis.

Many thanks

modified 10 months ago • written 10 months ago by seta1.2k
10 months ago by
egeulgen820
Istanbul
egeulgen820 wrote:

You may perform multiple Fisher's exact tests, store the p-values, and then adjust them. In R, you would do something along the lines of:

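### Perform Fisher's exact test for each SNP and store the p-values
p_vals <- numeric(N_SNPS)            # N_SNPS = number of SNPs; preallocate the result vector
for (i in seq_len(N_SNPS)) {
    result <- fisher.test(...)       # contingency table for SNP i goes here
    p_vals[i] <- result$p.value
}
### Adjust the p-values for multiple testing
p_adjusted <- p.adjust(p_vals, method = "YOUR_FAVORITE_METHOD")   # e.g. "BH" or "bonferroni"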

Hope this helps

written 10 months ago by egeulgen820

Thank you for the response. Could you please tell me how I should run multiple (here 3000) Fisher's exact tests and store the resulting p-value for each SNP?

modified 10 months ago • written 10 months ago by seta1.2k

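I provided the example R code in my response (see above). There, N_SNPS is replaced with 3000, and for fisher.test you have to do something like:

pop <- rep(c("mine", "1kg"), c(ncol(my_genotypes), ncol(kg_genotypes)))
fisher.test(table(c(my_genotypes[i, ], kg_genotypes[i, ]), pop))

This does depend on how you store the two; the lines above assume two matrices with one row per SNP and one column per individual.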

written 10 months ago by egeulgen820
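
Since the number of called individuals differs between SNPs, the allele totals have to be computed per SNP. A minimal sketch of that step, assuming (this is not stated in the thread) that genotypes are stored as a numeric matrix with one row per SNP, one column per individual, entries giving the number of ALT alleles (0, 1 or 2) and NA for a missing genotype. The matrix `geno` and its values are made up for illustration:

```r
# Hypothetical genotype matrix: rows = SNPs, columns = individuals,
# entries = number of ALT alleles (0, 1, 2), NA = missing genotype.
geno <- matrix(c(0, 1, 2, NA,
                 1, 1, NA, NA,
                 2, 2, 2, 0),
               nrow = 3, byrow = TRUE)

n_called  <- rowSums(!is.na(geno))         # individuals with a genotype call, per SNP
alt_count <- rowSums(geno, na.rm = TRUE)   # ALT alleles observed, per SNP
ref_count <- 2 * n_called - alt_count      # remaining alleles are REF (diploid assumed)

allele_counts <- data.frame(ref = ref_count, alt = alt_count)
```

Missing genotypes are simply left out of both counts, so each SNP gets its own total of `2 * n_called` alleles.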

Thanks for your help. I have both the ref and alt allele counts for each SNP from both populations; however, I'm not sure how to arrange them and feed them to R to run multiple Fisher's exact tests and get the p-value for each one. Could someone please help me further with this issue?

written 10 months ago by seta1.2k
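
When the ref and alt allele counts are already available, one possible way to arrange them is a data frame with one row per SNP and four count columns, from which a 2x2 table is built for each SNP. The sketch below is only an illustration under that assumption: the column names (`my_ref`, `my_alt`, `kg_ref`, `kg_alt`), the SNP labels and all counts are invented, and "BH" is just one of the methods `p.adjust` accepts.

```r
# Hypothetical allele counts, one row per SNP (values are made up).
counts <- data.frame(
  snp    = c("snp1", "snp2", "snp3"),
  my_ref = c(1500,  900, 1990),   # ref allele count, own population
  my_alt = c( 480, 1000,    6),   # alt allele count, own population
  kg_ref = c( 800,  700, 1000),   # ref allele count, 1000 Genomes population
  kg_alt = c( 206,  306,    6)    # alt allele count, 1000 Genomes population
)

# One Fisher's exact test per SNP on its 2x2 allele-count table.
p_vals <- vapply(seq_len(nrow(counts)), function(i) {
  tab <- matrix(c(counts$my_ref[i], counts$my_alt[i],
                  counts$kg_ref[i], counts$kg_alt[i]),
                nrow = 2,
                dimnames = list(allele = c("ref", "alt"),
                                population = c("mine", "1kg")))
  fisher.test(tab)$p.value
}, numeric(1))

# Keep raw and adjusted p-values next to the SNP they belong to.
counts$p_value <- p_vals
counts$p_adj   <- p.adjust(p_vals, method = "BH")
```

Because each row carries its own counts, SNPs with different allele totals in the sequenced population need no special handling. For a real analysis the data frame would be read from a file, for example with `read.table`, rather than typed in.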