Question: Using VEP: Variant Effect Predictor, how do I get the 1000G global Allele Frequency of my alternate alleles specifically?
anp375 wrote, 2.6 years ago:

Let's say I have this variant:

1 123141 . A C

My alternate allele at this site is C. The reported global MAF in 1000G is 0.8564. However, this is the allele frequency for A at that site, not C.

The options for running VEP are documented here: http://useast.ensembl.org/info/docs/tools/vep/script/vep_options.html

These are the VEP options I am using:

--check_existing --check_alleles --gmaf --maf_1kg --maf_esp

======================================================================

With these options, the columns produced by --maf_1kg seem to give the per-population allele frequency of my alt allele. What I need, though, is the global allele frequency for my alt allele. I'm still trying to figure out how this works, and while writing this question I keep finding new problems that don't make sense.

For rs200645137 +C insertion, VEP gives me this:

       AFR_MAF|AMR_MAF |EAS_MAF|EUR_MAF |SAS_MAF
              |C:0.4024|C:0.353|C:0.4365|C:0.2843

AFR_MAF is missing.

From dbSNP (https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=200645137) I get:

       AFR_MAF |AMR_MAF|EAS_MAF |EUR_MAF |SAS_MAF
       C:0.4024|C:0.353|C:0.4365|C:0.2843|C:0.456

So, really, SAS_MAF was missing and all the numbers were misaligned. VEP also gives me a global minor allele frequency of:

GMAF: -:0.4837

which is also the value listed in dbSNP.

Annovar gives me 1 minus that, or 0.516291, because I have the C allele present. But even that does not make sense to me, because when I calculate the allele frequency myself from the per-population values, I get this:

1008*0.4365+1006*0.2843+1322*0.4024+694*0.353+978*0.456==1948.921
1008+1006+1322+694+978==5008
1948.921/5008==0.3891615

So the 1000Genomes AF for my alt allele is 0.3891615.
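For anyone wanting to reproduce the calculation above, here is a minimal Python sketch of the same weighted average. The function name `pooled_frequency` is made up for illustration and is not part of VEP or Annovar. It assumes every sample contributes two alleles, which is only safe for autosomal sites.

```python
# Allele numbers per 1000 Genomes super-population, as used in the
# calculation above. They assume two alleles per sample (diploid site).
ALLELE_NUMBERS = {"EAS": 1008, "EUR": 1006, "AFR": 1322, "AMR": 694, "SAS": 978}

# Per-population frequencies of the C allele for rs200645137,
# taken from the dbSNP row quoted above.
ALT_FREQS = {"AFR": 0.4024, "AMR": 0.353, "EAS": 0.4365, "EUR": 0.2843, "SAS": 0.456}


def pooled_frequency(freqs, allele_numbers):
    """Weight each population's frequency by its allele number and pool."""
    allele_count = sum(freqs[pop] * allele_numbers[pop] for pop in freqs)
    total_alleles = sum(allele_numbers[pop] for pop in freqs)
    return allele_count / total_alleles


global_af = pooled_frequency(ALT_FREQS, ALLELE_NUMBERS)
print(round(global_af, 4))  # 0.3892
```

If the true allele numbers differ from these (for example, where some samples carry only one copy of the chromosome), the weights would have to be replaced with the real per-population allele numbers.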

What is going on!? I'm definitely using 1000G phase 3 data.

Edit: Okay, the global allele frequencies don't match because it's on the X-chromosome and not everyone has two of those. But the VEP columns still don't match up. The SAS_MAF is being shoved into the next column, and AFR_MAF is either empty, or is filled with a number I can't find anywhere else. The ESP columns for AA_MAF and EA_MAF don't seem to have the correct numbers either.

EnsemblWill (United Kingdom) wrote, 2.4 years ago:

> So, really, SAS_MAF was missing and all the numbers were misaligned

This issue was noticed by a few users who had the version 85 GRCh37 VEP cache. AFAIK the issue is not present in the version 86 or 87 cache files, so if you are able to update, please use the newer version. You could also use the 84 cache if you are tied to v85 of the software for any reason; the data should be identical.

> VEP also gives me a global minor allele frequency of: GMAF: -:0.4837

This is a running issue with the way VEP reports allele frequencies; VEP reports only the non-reference allele(s) and frequency, not necessarily the frequency of the allele as input.

It has been resolved in the new rewrite of VEP available in beta from https://github.com/Ensembl/ensembl-vep
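Until you can switch versions, one workaround is to convert the reported value yourself. The sketch below is only an illustration, not VEP code: `frequency_of` is a hypothetical helper, and it assumes the site is biallelic, so the two allele frequencies sum to 1. Under that assumption, GMAF `-:0.4837` implies 1 - 0.4837 = 0.5163 for the C allele, which is in line with the 0.516291 that Annovar reports.

```python
def frequency_of(allele, reported):
    """Return the frequency of `allele` from an 'ALLELE:FREQ' string.

    Assumes a biallelic site: if the reported allele is not the one
    asked for, the answer is the complement of the reported frequency.
    """
    reported_allele, freq_text = reported.rsplit(":", 1)
    freq = float(freq_text)
    if not 0.0 <= freq <= 1.0:
        raise ValueError(f"frequency out of range: {freq}")
    return freq if reported_allele == allele else 1.0 - freq


# GMAF as reported for rs200645137; the input alt allele is C.
print(round(frequency_of("C", "-:0.4837"), 4))  # 0.5163
```

This does not hold for multi-allelic sites, where the remaining frequency is shared between several alleles.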
