How to filter gnomADe_AF < 0.001 with unix?
2.8 years ago
HL ▴ 10

I need to filter a VEP-annotated VCF file to find all variants where gnomADe_AF < 0.001. Running bcftools query -i 'gnomADe_AF<0.001' -f '%CHROM %INFO/gnomADe_AF\n' file.vcf.annotated returns nothing and gives no errors.

If I run bcftools query -i 'AF<1' -f '%CHROM %INFO/AF\n' file.vcf.annotated instead, the command works and I get all variants with AF<1. I have checked the header, and both gnomAD_AF and AF are listed among the INFO tags.

Could anyone help with how to filter this with unix?

EDIT: I can see the values for gnomADe_AF when looking with less, but for some reason I can't get them with any commands separately.

gnomADe_AF filter VEP • 1.7k views

To filter a VCF by a particular INFO tag, the tag must be described in the VCF header, and your query must spell it exactly as it appears in both the header and the INFO column. Maybe you're trying to filter on gnomAD_AF but you're writing gnomADe_AF instead.


Yes, I have tried both spellings, and I have checked many times that the tag is written exactly as it appears in the header and the INFO column.

2.8 years ago
Emily 23k

Make sure you're running VEP with --af_gnomad. The fields in your info column are listed here: gnomADe_AF is not one of them. You can use the VEP filter script with your results to filter by any fields.


The file contains these header lines, and the tag also appears in the INFO column:

##INFO=<ID=gnomADe_AF,Number=.,Type=String,Description="AF field from /home/...

##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|....|gnomADe_Hom|gnomADe_Hemi|gnomADe_AF|gnomADe_AF_AMR|

So I think the tag really should be there. I have also tried gnomAD_AF, and that didn't work either. The tag was added with --custom.

2.8 years ago
sbstevenlee ▴ 480

Check out the vcf-vep command from the fuc package I wrote:

If you want to also exclude variants that do not have AF information:

$ fuc vcf-vep in.vcf "gnomAD_AF < 0.001" > out.vcf

If you want to treat variants without AF information as having AF of zero:

$ fuc vcf-vep in.vcf "gnomAD_AF < 0.001" --as_zero > out.vcf

For more help:

$ fuc vcf-vep -h
usage: fuc vcf-vep [-h] [--opposite] [--as_zero] vcf expr

This command will filter a VCF file annotated by Ensemble VEP.

Usage examples:
  $ fuc vcf-vep in.vcf "SYMBOL == 'TP53'" > out.vcf
  $ fuc vcf-vep in.vcf "SYMBOL != 'TP53'" > out.vcf
  $ fuc vcf-vep in.vcf "SYMBOL == 'TP53'" --opposite > out.vcf
  $ fuc vcf-vep in.vcf "Consequence in ['splice_donor_variant', 'stop_gained']" > out.vcf
  $ fuc vcf-vep in.vcf "(SYMBOL == 'TP53') and (Consequence.str.contains('stop_gained'))" > out.vcf
  $ fuc vcf-vep in.vcf "gnomAD_AF < 0.001" > out.vcf
  $ fuc vcf-vep in.vcf "gnomAD_AF < 0.001" --as_zero > out.vcf

Positional arguments:
  vcf         VCF file annotated by Ensemble VEP.
  expr        Query expression to evaluate.

Optional arguments:
  -h, --help  Show this help message and exit.
  --opposite  Use this flag to return only records that don't meet the said criteria.
  --as_zero   Use this flag to treat missing values as zero instead of NaN.

Thanks, this looks useful, but unfortunately I'm not able to download additional packages for security reasons, so I will have to manage with basic bcftools or vcftools.
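
For a route that needs nothing beyond standard unix tools, here is a minimal awk sketch. It assumes the tag sits directly in the INFO column as gnomADe_AF=<value> (possibly comma-separated for multiallelic sites), as the header quoted earlier in the thread suggests. That header declares the tag as Type=String, which may be why a numeric bcftools expression returns nothing; this is a guess, not something confirmed in the thread. The function name filter_gnomad_af is made up for illustration.

```shell
# Keep header lines plus records where any gnomADe_AF value is below a threshold.
# Assumption: the tag is written in INFO as gnomADe_AF=<v1>[,<v2>...].
# Records with a missing (".") or absent value are dropped.
filter_gnomad_af() {
  # $1 = VCF path, $2 = threshold (defaults to 0.001)
  awk -F'\t' -v tag="gnomADe_AF" -v max="${2:-0.001}" '
    /^#/ { print; next }                        # pass header lines through
    {
      n = split($8, info, ";")                  # INFO is column 8
      for (i = 1; i <= n; i++) {
        # match the exact tag, so gnomADe_AF_AMR= is not picked up
        if (index(info[i], tag "=") == 1) {
          m = split(substr(info[i], length(tag) + 2), vals, ",")
          for (j = 1; j <= m; j++) {
            # accept only numeric values, then compare as numbers
            if (vals[j] ~ /^[0-9.]+([eE][-+]?[0-9]+)?$/ && vals[j] != "." &&
                vals[j] + 0 < max + 0) {
              print
              next
            }
          }
        }
      }
    }
  ' "$1"
}
```

Usage would look like filter_gnomad_af file.vcf.annotated 0.001 > filtered.vcf. If the value is only present inside the pipe-separated CSQ string, this sketch will not find it and the CSQ field would have to be split on "|" first.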
