I have a VCF file generated with GATK and annotated with SnpEff. The header describes the annotation field as:
> ##INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO'">
An example of a variant:
AC=5;AF=0.200;AN=10;ANN=C|intron_variant|MODIFIER|GAPDH|4214|transcript|GAPDH_transcript|sim4cc|n.1964-49T>C||||||
I have been using GATK SelectVariants to generate variant lists based on sample genotype and quality metrics, but I would also like to isolate variants by their Annotation_Impact value. In the example above, I would want to get all variants annotated as "MODIFIER". I have not been able to get the format of the JEXL expression right. Can someone suggest a command?
Thank you, S
Just curious--do you want to produce a valid VCF as output, or just find the records of interest?
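If just finding the records is enough, here is a minimal sketch in plain Python rather than JEXL. It assumes an uncompressed VCF, that INFO is the eighth tab-separated column, and that Annotation_Impact is the third `|`-separated subfield of each ANN entry, as in the header description above. The function names `ann_impacts` and `filter_by_impact` are made up for illustration and are not part of GATK or SnpEff.

```python
def ann_impacts(info):
    """Return the set of Annotation_Impact values in one INFO string."""
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            impacts = set()
            # A record can carry several comma-separated annotations.
            for ann in entry[len("ANN="):].split(","):
                fields = ann.split("|")
                # Assumption: impact is the third subfield (index 2).
                if len(fields) > 2:
                    impacts.add(fields[2])
            return impacts
    return set()


def filter_by_impact(lines, impact):
    """Yield header lines unchanged, plus records having the given impact."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        # Assumption: INFO is column 8 (index 7) of each record.
        if len(cols) > 7 and impact in ann_impacts(cols[7]):
            yield line


# Example use (file names are placeholders):
# with open("in.vcf") as src, open("modifier.vcf", "w") as dst:
#     dst.writelines(filter_by_impact(src, "MODIFIER"))
```

A record is kept if any of its annotations has the requested impact, so a variant with both a MODIFIER and a HIGH annotation would show up in both lists. Because the header lines are passed through untouched, the output should still be readable as a VCF, but I have not checked it against a validator.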