How can I get MAF stratified by subpopulation for ASW, CEU, and YRI?
6.8 years ago
Vincent Laufer ★ 1.4k

Hello,

I would like to obtain allele frequency by subpopulation for 102 SNPs.

The particular subpopulations I am interested in are ASW (African Americans from the Southwest US), YRI, and CEU.

I would like to batch-submit them, if possible. I know I can do this through dbSNP, but sometimes the number of individuals used to compute the MAF gives me pause. The list of SNPs is here:

rs187786174
rs227163
rs2301888
rs28411352
rs12140275
rs2476601
rs624988
rs2228145
rs2317230
rs4656942
rs72717009
rs75409195
rs2105325
rs17668708
rs10175798
rs34695944
rs13385025
rs1858037
rs9653442

MAF allele frequency subpopulation genetics • 3.0k views

I'd suggest not pasting the entire list here, because it seems to convey the idea that you want us to do the work for you. If you wish to point out specific cases, please do that instead.


Also, please elaborate on what you mean by "number of individuals gives you pause".


At this URL you can see an example of what I mean:

http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=227163
This SNP has been around a while, so its "population diversity" table is quite large. For other SNPs the table is much smaller, and the sample chromosome count is often only 2, 8, or 10. I'd rather pull from a source that has at least 100 chromosomes for a given ethnicity; if that is not possible, I will of course take what I can get.
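
To make the sample-size concern concrete, here is a minimal Python sketch of the filter described above: keep only frequency records for the populations of interest that are backed by at least 100 sampled chromosomes. The `PopFreq` record layout, the function name, and the example values (including the placeholder rsIDs) are all hypothetical and made up for illustration; they are not taken from dbSNP.

```python
from dataclasses import dataclass


@dataclass
class PopFreq:
    """One population-level frequency record (hypothetical layout)."""
    rsid: str
    population: str
    minor_allele: str
    maf: float
    chrom_count: int  # number of sampled chromosomes behind the estimate


def filter_by_sample_size(records, populations, min_chromosomes=100):
    """Keep records for the wanted populations with enough chromosomes."""
    wanted = set(populations)
    return [
        r for r in records
        if r.population in wanted and r.chrom_count >= min_chromosomes
    ]


# Made-up example records; real values would come from dbSNP or similar.
records = [
    PopFreq("rs0000001", "CEU", "A", 0.25, 120),
    PopFreq("rs0000001", "YRI", "A", 0.10, 8),    # too few chromosomes
    PopFreq("rs0000001", "ASW", "A", 0.15, 98),   # just under the cutoff
    PopFreq("rs0000002", "YRI", "G", 0.40, 226),
    PopFreq("rs0000002", "JPT", "G", 0.30, 200),  # not a wanted population
]

kept = filter_by_sample_size(records, ["ASW", "CEU", "YRI"])
for r in kept:
    print(r.rsid, r.population, r.minor_allele, r.maf, r.chrom_count)
```

The cutoff is a parameter, so it can be relaxed for SNPs where no population reaches 100 chromosomes.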


Well then you're left with two options: a dbSNP batch query, or a really REALLY long procedure that will give you a table of variant, pop[1..N]_alleleFreq, pop[1..N]_alleleCount.

Unless you're looking at frequent queries of this kind, the second approach is not worth the effort.

EDIT: VEP is definitely a viable option. It is your best option, I think.
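
For reference, the table layout mentioned in this comment (variant, per-population allele frequency, per-population allele count) could be assembled roughly as follows once per-population genotypes are in hand. This is only a sketch of the final tabulation step, not the full procedure: the `allele_table` function, the `"A/G"` genotype string format, and the example genotypes are assumptions made up for illustration.

```python
from collections import Counter


def allele_table(genotypes_by_pop, allele):
    """Return {population: (allele_freq, allele_count, chrom_count)}.

    genotypes_by_pop maps a population code to a list of diploid
    genotype strings such as "A/G" (hypothetical input format).
    """
    row = {}
    for pop, genotypes in genotypes_by_pop.items():
        counts = Counter()
        for gt in genotypes:
            counts.update(gt.split("/"))  # "A/G" contributes one A, one G
        total = sum(counts.values())
        count = counts[allele]
        freq = count / total if total else float("nan")
        row[pop] = (freq, count, total)
    return row


# Made-up genotypes for a single variant in two populations.
genotypes = {
    "CEU": ["A/A", "A/G", "G/G", "A/G"],
    "YRI": ["A/A", "A/A", "A/G", "A/A"],
}

row = allele_table(genotypes, "G")
for pop, (freq, count, total) in row.items():
    print(f"{pop}\talleleFreq={freq}\talleleCount={count}\tchromosomes={total}")
```

Reporting the chromosome count next to each frequency keeps the sample size visible, which is the concern raised earlier in the thread.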

6.8 years ago

dbSNP rs IDs can be fed directly into Ensembl's VEP annotator, which reports population-specific minor allele frequencies from 1000 Genomes and NHLBI ESP: http://www.ensembl.org/Homo_sapiens/Tools/VEP

Here's the subset of information it returns that you'll find relevant:

GMAF - minor allele and frequency of existing variation in 1000 Genomes Phase 1
AFR_MAF, AMR_MAF, ASN_MAF, EUR_MAF - Same as above, for African, American, Asian, and European populations, respectively
AA_MAF, EA_MAF - minor allele and frequency of existing variant in NHLBI-ESP African American and European American populations, respectively

But you listed some, like chr19:10771941 and chr21:35928240, that are just genomic loci and don't specify the reference and variant alleles. See the VEP documentation for the various input formats it supports.
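
Before submitting, it may help to separate the rs IDs from the bare loci so that only usable identifiers go in. A minimal Python sketch, assuming the list is available as one identifier per line; the function name and the regular expressions are my own assumptions, not part of VEP:

```python
import re

RSID = re.compile(r"^rs\d+$")                      # e.g. rs227163
LOCUS = re.compile(r"^chr(?:\d+|X|Y|MT?):\d+$")    # e.g. chr19:10771941


def split_identifiers(lines):
    """Split raw input into rs IDs, bare loci, and anything else.

    Blank lines and exact duplicates are dropped; order is preserved.
    """
    rsids, loci, other = [], [], []
    seen = set()
    for raw in lines:
        item = raw.strip()
        if not item or item in seen:
            continue
        seen.add(item)
        if RSID.match(item):
            rsids.append(item)
        elif LOCUS.match(item):
            loci.append(item)
        else:
            other.append(item)
    return rsids, loci, other


ids = ["rs227163", "rs2476601", "chr19:10771941", "rs227163", "", "chr21:35928240"]
rsids, loci, other = split_identifiers(ids)

print("\n".join(rsids))          # one identifier per line, ready to paste
print("Need alleles:", loci)     # loci that still lack ref/variant alleles
```

The rs IDs can then be pasted one per line, while the loci need reference and variant alleles added before they can be annotated.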


Thank you, this is perfect.