Question: How to extract total allele frequency from gnomAD annotation
6 months ago by
yasminsoareslima10 wrote:

Hello people,

I am quite confused about how to extract the information I want from the Genomad annotation database (downloaded .vcf). I've downloaded the .vcf from Genomad and annotated using SnpEff, which resulted in a great deal of information, for instance the allele frequencies for each population, for controls, non_cancer, topmed, etc.

However, what interests me is the total allele frequency, the same one that appears in the browser version. The only annotation I could find that comes close to this is "AF_raw", but when I compared it to the browser version, I saw it was not the same for some variants. Could anyone tell me which INFO field I should extract from my vcf to obtain this information?

Thanks!

gnmoad af allelefrequency • 885 views
ADD COMMENTlink modified 4 months ago by lecob0 • written 6 months ago by yasminsoareslima10

GenomAD

Do you mean gnomAD? Please give us a couple of example loci where you see a difference between the browser and your VCF. For those loci, also paste the entire INFO values here so we can see what's going on.

ADD REPLYlink modified 6 months ago • written 6 months ago by RamRS23k

Can I ask why you had to annotate the vcf file? I have the vcf file for the gnomAD genomes and it already contains the information you mentioned ("the allele frequencies from each population, from controls, non_cancer, topmed, etc.").

Maybe yours was an older version and I downloaded the new one? (I downloaded the data in the first week of March.)

Among the INFO lines at the top of the file I can see, for example:

##INFO=<ID=AC,Number=A,Type=Integer,Description="Alternate allele count for samples">
##INFO=<ID=AN,Number=A,Type=Integer,Description="Total number of alleles in samples">
##INFO=<ID=AF,Number=A,Type=Float,Description="Alternate allele frequency in samples">

Is the last one the one you are interested in?
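
If it helps to check which value you are looking at: the three header lines above should be related by AF = AC / AN. A minimal sketch, assuming AC and AN have already been read from the INFO column (the function name and the counts are made up for illustration and are not taken from a gnomAD record):

```python
def allele_frequency(ac, an):
    """Return AC / AN, or None when no alleles were called (AN == 0)."""
    if an == 0:
        return None
    return ac / an


# Hypothetical counts, not from a real gnomAD record.
print(allele_frequency(5, 20000))
```

If the AF in the file differs from AC / AN computed from the same line, that would suggest the values are being read from different field sets (for example the raw ones).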

ADD REPLYlink written 4 months ago by lecob0

I think it's a gap in understanding - the OP meant that they downloaded the gnomAD VCF and used it to annotate their own VCF.

Also, this post should not be an answer. It's not answering the top level question; it's a request for clarification and hence should be a comment. I'm moving it to a comment now, but please be more careful in the future.

ADD REPLYlink written 4 months ago by RamRS23k

Hi Lecob, it is exactly as RamRS said. I downloaded the gnomAD VCF in order to use it to annotate MY vcf, with my data. And if you look closely, this AF doesn't match the AF shown in the browser. The one that would match the browser is usually the adjusted one, i.e. AF_adj, but I couldn't find it. So I'm not sure I'm right.

ADD REPLYlink written 4 months ago by yasminsoareslima10

You should really just post your SnpEff command if you're having trouble annotating your vcf file, since we know the value you want should be in the gnomAD vcf under AF. But you're saying your gnomAD file has the raw value under AF instead?

Did you get your gnomAD file directly from gnomAD or from some other source? I've never seen a gnomAD field named AF_adj; the fields should be AF_raw and AF, so to me this suggests you're not using an official gnomAD file. Try downloading the file directly from gnomAD and using that instead, or post a few sample lines of your gnomAD file so we can see how the INFO is denoted.

ADD REPLYlink modified 4 months ago • written 4 months ago by manuel.belmadani1.1k

Hi Manuel,

Thank you for trying to help. Yes, I downloaded my gnomAD file from the gnomAD website. I haven't seen AF_adj for gnomAD either; it was just what I was expecting to find, as there is a similar field for ExAC, for example.

Thanks to your comments I found the AF field and managed to annotate with it. However, comparing both the AF_raw and AF fields to what the browser version shows, they are still different. There is no match. Of course the values are very similar, yet not the same.

I believe I downloaded the latest version: gnomad.exomes.r2.1.sites.vcf.bgz

ADD REPLYlink written 4 months ago by yasminsoareslima10
6 months ago by
Canada
manuel.belmadani1.1k wrote:

In the gnomAD vcf, the allele frequency should be the "AF=XXXXXX" entry in the INFO column, for example "AF=3.28019e-05" for rs1411594174. The AF_raw field for this variant is indeed slightly different.

There are also subset- and population-specific frequencies, like "AF_eas_female".
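
To make the distinction concrete, here is a rough sketch of pulling AF, AF_raw and a population-specific field out of an INFO string. The helper names `parse_info` and `get_floats` are made up for this example, and only the AF value is the one quoted above; the other values are placeholders, not real gnomAD numbers.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def get_floats(fields, key):
    """Return the values of `key` as floats (one per ALT allele), or [] if absent."""
    raw = fields.get(key)
    if raw is None or raw is True:
        return []
    return [float(v) for v in raw.split(",") if v != "."]


# Illustrative INFO string: the AF value is the one quoted above,
# the other values are placeholders, not real gnomAD numbers.
example_info = "AF=3.28019e-05;AF_raw=3.3e-05;AF_eas_female=0.0"
fields = parse_info(example_info)
print(get_floats(fields, "AF"), get_floats(fields, "AF_raw"))
```

Because the lookup is by exact key, "AF" and "AF_raw" cannot be mixed up, and a missing key gives an empty list rather than an error.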

ADD COMMENTlink written 6 months ago by manuel.belmadani1.1k

Yeah, but it seems I don't have the "AF" information in my vcf, which is why I'm wondering what it is called or how it is annotated. Perhaps it comes under a different name, I don't know.

ADD REPLYlink written 6 months ago by yasminsoareslima10

I'm not sure then, it seems like a SnpEff-specific issue. Try looking for examples where SnpEff was used with gnomAD data, or post the steps required (every command line and file obtained) to reproduce your problem. An example SNP/variant where you see the AF_raw field but not the AF_ALL one could be useful too (post the entire line with annotations).

Or you could also use ANNOVAR. I have often annotated VCFs using ANNOVAR, and the annotation for the gnomAD AF was named gnomAD_genome_ALL (there are also exome ones; the naming is somewhat specific to the way the ANNOVAR database was built).
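
If you go that route, something like the following could pull the column out of the annotated table. This is only a sketch: it assumes a tab-delimited output with a header row containing Chr, Start, Ref, Alt and gnomAD_genome_ALL, and that "." marks a variant with no gnomAD entry. The function name and the sample rows are made up for illustration.

```python
import csv
import io


def read_gnomad_af(handle, column="gnomAD_genome_ALL"):
    """Yield (Chr, Start, Ref, Alt, AF) from a tab-delimited annotation table.

    AF is None when the column holds "." (assumed here to mean no entry).
    """
    for row in csv.DictReader(handle, delimiter="\t"):
        value = row[column]
        af = None if value in (".", "") else float(value)
        yield row["Chr"], row["Start"], row["Ref"], row["Alt"], af


# Made-up rows standing in for a real annotated file.
sample = (
    "Chr\tStart\tEnd\tRef\tAlt\tgnomAD_genome_ALL\n"
    "1\t1000\t1000\tA\tG\t0.0012\n"
    "1\t2000\t2000\tC\tT\t.\n"
)
rows = list(read_gnomad_af(io.StringIO(sample)))
for row in rows:
    print(row)
```

For a real file you would pass an open file handle instead of the `io.StringIO` sample, and change `column` if your database uses the exome naming.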

ADD REPLYlink modified 6 months ago • written 6 months ago by manuel.belmadani1.1k