I've successfully annotated a VCF file using snpEff but would like to filter the resulting VCF file using snpSift.
The VCF file is in the following format:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 Sample2 Sample3
A 8725 . C T . PASS ADP=99;WT=2;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/0:145:77:77:77:0:0%:1E0:37:0:41:36:0:0 0/1:0:58:58:55:3:5.17%:9.8E-1:36:33:25:30:3:0 0/0:285:163:162:161:1:0.62%:9.8E-1:36:39:78:83:1:0
I can filter on the main columns, e.g. the INFO column, but I don't know how to filter on the per-sample columns (Sample1, Sample2, Sample3). For instance, I would like to keep only VCF entries where one sample has a FREQ of at least 5% and the other two samples have a FREQ below 2%.
So far I have the command below, but I would like to add the expressions described above. Any ideas how I can do this?
cat A.vcf | java -jar SnpSift.jar filter " ((NC = 0) & ( REF = 'C' ) & ( ALT = 'T')) " > A_filtered.vcf
Thanks, but that doesn't work: SnpSift doesn't recognise it as a field, nor Sample1. I tried adding Sample1 to the VCF header so I could at least search for a string, but it is still not recognised. I thought the program would look at the header to determine which fields are available. I may have to just stick with a Perl script, but I was hoping I could do this with existing tools and make it simpler for others.
What if, instead of FREQ[1], you try GEN[0].FREQ > 5 and so on? For me it works when I use it for DP in the INFO field of my file; unfortunately I do not have a FREQ field. I tried it with your VCF example, but it throws an exception saying "Genotype numer '0' does not exists".
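If the SnpSift expression cannot be made to work, a short script is one fallback. Below is a minimal Python sketch of the filter described in the question (exactly one sample with FREQ of at least 5%, all others below 2%). It assumes the data lines are tab-separated, that FREQ is listed in the FORMAT column, and that its values look like "5.17%"; the function names (parse_freq, sample_freqs, keep, filter_vcf) are made up for illustration and are not part of any existing tool.

```python
def parse_freq(value):
    """Convert a FREQ string such as '5.17%' to a float; missing values become NaN."""
    value = value.strip()
    if value in ("", "."):
        return float("nan")
    return float(value.rstrip("%"))


def sample_freqs(line):
    """Return the FREQ of every sample on one tab-separated VCF data line."""
    cols = line.rstrip("\n").split("\t")
    keys = cols[8].split(":")      # FORMAT column names the per-sample keys
    idx = keys.index("FREQ")       # raises ValueError if FREQ is absent
    freqs = []
    for sample in cols[9:]:
        fields = sample.split(":")
        # Samples with truncated fields (e.g. './.') are treated as missing.
        freqs.append(parse_freq(fields[idx]) if idx < len(fields) else float("nan"))
    return freqs


def keep(freqs, high=5.0, low=2.0):
    """True if exactly one sample is >= high and every other sample is < low."""
    n_high = sum(f >= high for f in freqs)
    n_low = sum(f < low for f in freqs)
    # NaN compares False both ways, so a record with a missing FREQ is dropped.
    return n_high == 1 and n_low == len(freqs) - 1


def filter_vcf(lines):
    """Yield header lines unchanged and only the data lines that pass keep()."""
    for line in lines:
        if line.startswith("#") or keep(sample_freqs(line)):
            yield line


# The record from the question, rebuilt with tab separators.
example = "\t".join([
    "A", "8725", ".", "C", "T", ".", "PASS", "ADP=99;WT=2;HET=1;HOM=0;NC=0",
    "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR",
    "0/0:145:77:77:77:0:0%:1E0:37:0:41:36:0:0",
    "0/1:0:58:58:55:3:5.17%:9.8E-1:36:33:25:30:3:0",
    "0/0:285:163:162:161:1:0.62%:9.8E-1:36:39:78:83:1:0",
])
print(sample_freqs(example), keep(sample_freqs(example)))
```

To filter a whole file, the same generator could be fed from standard input, for example with sys.stdout.writelines(filter_vcf(sys.stdin)) after importing sys, and chained after the existing SnpSift command for the NC, REF and ALT conditions.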