Question: Annotate VCF on specific transcripts (ANN field)
19 months ago by
FGV110 wrote:

Dear all,

I've been using snpEff and SnpSift for a while to annotate and filter my VCF files. By default, snpEff annotates variants against all transcripts. Recently I wanted to restrict my annotations (the ANN field) to a list of ~150 transcripts, but could not find a way to do it. Is it possible? As a workaround, I re-annotated the VCF with snpEff using the "-onlyTr" option, but it is annoying to have to annotate a VCF twice. Is there a way to just filter the existing annotations?

However, even with the "-onlyTr" option, I still get some annotations that should have been removed. In the example below, the variant is annotated on both the MSH6 and FBXO11 genes, even though I only specified the transcript ENST00000234420. I guess it is because gene FBXO11 has no transcript, but if I am only interested in that list of transcripts, this gene should not be included, right?


2 48033890 . CT C 1846.7 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=1.216;ClippingRankSum=0.000;DP=498;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;POSITIVE_TRAIN_SITE;QD=4.44;ReadPosRankSum=-0.195;SOR=0.711;VQSLOD=4.16;culprit=SOR;ANN=C|intron_variant|MODIFIER|MSH6|ENSG00000116062|transcript|ENST00000234420|protein_coding|9/9|c.4002-10delT||||||INFO_REALIGN_3_PRIME,C|intragenic_variant|MODIFIER|FBXO11|ENSG00000138081|gene_variant|ENSG00000138081|||n.48033891delA|||||| GT:AD:DP:GQ:PL 0/1:254,162:416:99:1884,0,3658

Just a thought: maybe a custom GFF3 with just the gene/transcripts needed could be created and imported as a custom annotation database into snpEff. Then the annotation could be run against that database, which would probably save time, too.
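To illustrate the idea, here is a minimal Python sketch of how a GFF3 could be reduced to a chosen set of transcripts. It is only a sketch under assumptions: the function name `filter_gff3` is made up for this example, and it assumes Ensembl-style attributes where a transcript line carries something like `ID=transcript:ENST00000234420` and child features point to it through `Parent=`. Building the snpEff database from the resulting file is a separate step that is not shown here.

```python
def _attrs(col9):
    """Parse the GFF3 attribute column (column 9) into a dict."""
    return dict(kv.split("=", 1) for kv in col9.strip().split(";") if "=" in kv)


def filter_gff3(lines, keep_transcripts):
    """Keep only the wanted transcripts, their parent genes and their children.

    Assumption: transcript IDs look like 'transcript:ENST...' (Ensembl style)
    or are the bare transcript ID; adapt the matching to your own GFF3.
    """
    lines = [l.rstrip("\n") for l in lines]

    # Pass 1: collect the IDs of the wanted transcripts and of their genes.
    transcript_ids, gene_ids = set(), set()
    for l in lines:
        if l.startswith("#") or l.count("\t") < 8:
            continue
        a = _attrs(l.split("\t")[8])
        fid = a.get("ID", "")
        if fid.split(":")[-1] in keep_transcripts:
            transcript_ids.add(fid)
            gene_ids.update(p for p in a.get("Parent", "").split(",") if p)

    # Pass 2: keep headers, the selected genes/transcripts, and any feature
    # (exon, CDS, UTR...) whose Parent is one of the selected transcripts.
    out = []
    for l in lines:
        if l.startswith("#"):
            out.append(l)
            continue
        if l.count("\t") < 8:
            continue  # not a feature line (e.g. an embedded FASTA section)
        a = _attrs(l.split("\t")[8])
        fid = a.get("ID", "")
        parents = a.get("Parent", "").split(",")
        if fid in transcript_ids or fid in gene_ids or any(
            p in transcript_ids for p in parents
        ):
            out.append(l)
    return out
```

Sibling transcripts of a selected gene are dropped on purpose, because only children of the selected transcripts are followed, not children of the gene.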

(comment written 19 months ago by Vitis)

19 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

I wrote a tool that removes SnpEff ANN annotations based on a list of gene names/transcripts:


 java -jar dist/vcfburdenfiltergenes.jar -a "NM_206933.2" in.vcf > out.vcf
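
For readers who want to see the principle without a dedicated tool, the same kind of filtering can be sketched in a few lines of Python. This is not how the tool above is implemented; it is a hedged sketch with made-up function names (`filter_ann`, `filter_vcf`). It assumes the standard snpEff ANN layout, where each comma-separated entry is split by `|` and the seventh sub-field is the Feature_ID (the transcript ID for transcript-level annotations), and it matches IDs exactly, including any version suffix.

```python
def filter_ann(vcf_line, keep_transcripts):
    """Return one VCF line with ANN entries restricted to keep_transcripts.

    Entries whose Feature_ID (7th '|' sub-field) is not in the set are
    dropped; if nothing is left, the ANN key is removed from INFO.
    """
    line = vcf_line.rstrip("\n")
    if line.startswith("#"):
        return line  # header lines are passed through unchanged
    cols = line.split("\t")
    if len(cols) < 8:
        return line  # not a well-formed data line; leave it alone

    info_out = []
    for item in cols[7].split(";"):
        if item.startswith("ANN="):
            kept = []
            for entry in item[4:].split(","):
                fields = entry.split("|")
                if len(fields) > 6 and fields[6] in keep_transcripts:
                    kept.append(entry)
            if kept:
                info_out.append("ANN=" + ",".join(kept))
        else:
            info_out.append(item)

    cols[7] = ";".join(info_out) if info_out else "."
    return "\t".join(cols)


def filter_vcf(lines, keep_transcripts):
    """Yield filtered lines for any iterable of VCF lines (e.g. an open file)."""
    keep = set(keep_transcripts)
    for line in lines:
        yield filter_ann(line, keep)
```

On the variant from the question, keeping only ENST00000234420 retains the MSH6 entry and drops the FBXO11 one, because the Feature_ID of the latter is the gene ID ENSG00000138081 rather than a listed transcript.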