SnpSift set intersection with EFF annotations
3.3 years ago
Richard ▴ 590

Hi all,

I have a list of genes, and I want to extract the VCF records in which those genes are annotated as modified, for example with a SnpEff impact of "HIGH".

To intersect my gene list with a VCF file and extract these HIGH impact variants, I can use this command:

java -jar SnpSift.jar filter -s gene_list.txt "( (EFF[*].IMPACT = 'HIGH') & ANN[*].GENE in SET[0] )" my.vcf

This extracts lines that have an annotation for a gene in my list and also have an annotation of HIGH impact. However, the two [*] conditions appear to be checked independently of each other, so the filter occasionally picks up a line where the HIGH impact annotation belongs to a gene that is not in my list, while a gene that is in my list has only a lower-impact annotation on the same line.

Is there a way to select only the lines where the HIGH impact annotation itself is in one of my genes (i.e. not lines where some other gene has a HIGH impact variant that merely overlaps one of my genes)?
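
To make the problem concrete, here is a minimal Python sketch of the two matching behaviours. It is only a toy model of what the question describes, not SnpSift's actual code; the function names, gene names, and the (impact, gene) tuples are all made up for illustration.

```python
# Toy model: each annotation on a VCF line is reduced to (impact, gene).
# All names here are hypothetical and only illustrate the matching logic.

def wildcard_match(annotations, gene_set):
    """Each condition may be satisfied by a DIFFERENT annotation."""
    has_high = any(impact == "HIGH" for impact, _ in annotations)
    has_gene = any(gene in gene_set for _, gene in annotations)
    return has_high and has_gene

def paired_match(annotations, gene_set):
    """Both conditions must be satisfied by the SAME annotation."""
    return any(impact == "HIGH" and gene in gene_set
               for impact, gene in annotations)

# One line: HIGH impact in GENE_B (not in my list),
# MODERATE impact in GENE_A (in my list).
line_annotations = [("HIGH", "GENE_B"), ("MODERATE", "GENE_A")]
my_genes = {"GENE_A"}

unwanted_hit = wildcard_match(line_annotations, my_genes)  # line is kept
wanted_result = paired_match(line_annotations, my_genes)   # line is dropped
```

The second function is the behaviour I am after: the line should only be kept when a single annotation is both HIGH impact and in my gene list.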

3.3 years ago
Richard ▴ 590

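I found a script, vcfEffOnePerLine.pl (in the SnpEff scripts directory), that splits the VCF annotations so that there is one per line. Piping its output into the filter command from my question gets me where I need to be: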

cat my.vcf | snpEff-4.3/scripts/vcfEffOnePerLine.pl | java -jar snpEff-4.3/SnpSift.jar filter -s gene_list_20200218.txt "(  (EFF[*].IMPACT = 'HIGH') & ANN[*].GENE in SET[0] )"
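
For anyone who wants to check the result independently, the same per-annotation test can be sketched in a few lines of Python. This is a rough sketch, not a replacement for SnpSift: it assumes the annotations are in the ANN INFO field with the standard pipe-separated layout (impact in the third sub-field, gene name in the fourth), and the function name and toy VCF lines are made up for illustration.

```python
def high_impact_in_genes(vcf_line, genes):
    """Return True if a single ANN entry is HIGH impact AND names a gene in `genes`.

    Assumes a tab-separated VCF data line whose INFO column (8th) carries
    ANN=entry1,entry2,... with '|'-separated sub-fields per entry.
    """
    columns = vcf_line.rstrip("\n").split("\t")
    if len(columns) < 8:
        return False
    for item in columns[7].split(";"):
        if not item.startswith("ANN="):
            continue
        for entry in item[len("ANN="):].split(","):
            sub = entry.split("|")
            # sub[2] is the impact, sub[3] is the gene name
            if len(sub) > 3 and sub[2] == "HIGH" and sub[3] in genes:
                return True
    return False

my_genes = {"GENE_A"}

# Toy lines with shortened ANN entries (real entries have more sub-fields).
mixed = "1\t100\t.\tA\tT\t.\tPASS\tANN=T|stop_gained|HIGH|GENE_B,T|missense_variant|MODERATE|GENE_A"
paired = "1\t200\t.\tG\tC\t.\tPASS\tANN=C|stop_gained|HIGH|GENE_A,C|synonymous_variant|LOW|GENE_B"

keep_mixed = high_impact_in_genes(mixed, my_genes)    # HIGH is in GENE_B only
keep_paired = high_impact_in_genes(paired, my_genes)  # HIGH is in GENE_A
```

Because each annotation is tested on its own, a line is kept only when the HIGH impact and the listed gene occur in the same entry, which is what splitting to one annotation per line achieves in the pipeline above.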