Question: Annotate VCF on specific transcripts (ANN field)
FGV wrote (3 months ago):

Dear all,

I've been using snpEff and SnpSift for a while to annotate and filter my VCF files. Usually I just annotate my VCF with snpEff, but by default snpEff annotates all transcripts. Recently I wanted to restrict my annotations (ANN) to a list of ~150 transcripts, but could not find a way to do it. Is it possible? As a workaround, I re-annotated the VCF with snpEff using the "-onlyTr" option, but it is annoying to have to annotate a VCF twice. Is there a way to just filter the existing annotations?

However, even with the "-onlyTr" option, I still get some annotations that should have been removed. In the example below, snpEff annotates the variant on both the MSH6 and FBXO11 genes, even though I only specified transcript ENST00000234420. I guess it is because gene FBXO11 has no transcript, but if I am only interested in that list of transcripts, this gene should not be included, right?

thanks

2 48033890 . CT C 1846.7 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=1.216;ClippingRankSum=0.000;DP=498;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;POSITIVE_TRAIN_SITE;QD=4.44;ReadPosRankSum=-0.195;SOR=0.711;VQSLOD=4.16;culprit=SOR;ANN=C|intron_variant|MODIFIER|MSH6|ENSG00000116062|transcript|ENST00000234420|protein_coding|9/9|c.4002-10delT||||||INFO_REALIGN_3_PRIME,C|intragenic_variant|MODIFIER|FBXO11|ENSG00000138081|gene_variant|ENSG00000138081|||n.48033891delA|||||| GT:AD:DP:GQ:PL 0/1:254,162:416:99:1884,0,3658
Tags: snpeff, snpsift, vcf

Just a thought: maybe a custom GFF3 with just the gene/transcripts needed could be created and imported as a custom annotation database into snpEff. Then the annotation could be run against that database, which would probably save time, too.

(comment written 3 months ago by Vitis)
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 3 months ago:

I wrote a tool that removes snpEff ANN annotations based on a list of gene names or transcripts: http://lindenb.github.io/jvarkit/VcfBurdenFilterGenes.html

e.g.:

 java -jar dist/vcfburdenfiltergenes.jar -a "NM_206933.2" in.vcf > out.vcf
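
For readers who would rather post-filter an already annotated VCF with a small script, here is a minimal Python sketch. It is not how the jvarkit tool above is implemented; the function names are made up for illustration. It assumes the standard snpEff ANN layout, where entries are comma-separated, sub-fields are separated by "|", and the 7th sub-field is Feature_ID. Entries whose Feature_ID is not in the wanted set are dropped, which also removes gene-level entries such as the FBXO11 one in the question, since their Feature_ID is a gene ID rather than a transcript ID.

```python
# Zero-based position of Feature_ID in a '|'-separated snpEff ANN entry.
FEATURE_ID_INDEX = 6


def _feature_id_matches(entry, keep_transcripts):
    """True if the entry's Feature_ID (with or without version) is wanted."""
    parts = entry.split("|")
    if len(parts) <= FEATURE_ID_INDEX:
        return False
    feature_id = parts[FEATURE_ID_INDEX]
    # Accept both 'ENST00000234420' and a versioned form like 'ENST00000234420.5'.
    return (feature_id in keep_transcripts
            or feature_id.split(".")[0] in keep_transcripts)


def filter_ann(info, keep_transcripts):
    """Return the INFO string with ANN restricted to the wanted transcripts.

    The ANN key is dropped entirely when no entry survives.
    """
    kept_fields = []
    for field in info.split(";"):
        if field.startswith("ANN="):
            entries = field[len("ANN="):].split(",")
            kept = [e for e in entries
                    if _feature_id_matches(e, keep_transcripts)]
            if kept:
                kept_fields.append("ANN=" + ",".join(kept))
        else:
            kept_fields.append(field)
    # An empty INFO column is written as '.' in VCF.
    return ";".join(kept_fields) if kept_fields else "."


def filter_vcf_line(line, keep_transcripts):
    """Filter one tab-separated VCF line; header lines pass through unchanged."""
    if line.startswith("#"):
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        return line
    cols[7] = filter_ann(cols[7], keep_transcripts)
    return "\t".join(cols) + "\n"


# Possible use, reading a VCF on stdin (the transcript set is an example):
#   import sys
#   keep = {"ENST00000234420"}
#   for line in sys.stdin:
#       sys.stdout.write(filter_vcf_line(line, keep))
```

Note that this only edits the INFO column of each record. It does not drop variants that end up without any ANN entry, and it leaves the header untouched.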