ANNOVAR doesn't recognize multiple variants at single locus
1
0
Entering edit mode
3.1 years ago
j.lunger18 ▴ 30

Hi there,

I'm using ANNOVAR to annotate variants from somatic sequencing data. I first annotate my VCF files with snpEff, which correctly picks up multiple variants at a single locus. I then annotate with ANNOVAR, which seems unable to handle the comma separating the ALT alleles and only annotates the first allele in a comma-delimited list. Here is an example. The columns are, in order: chromosome, position, REF, ALT, snpEff_Allele, and REVEL score. The REVEL score should be 0.545 for row 1 and 0.698 for rows 2 and 3.

chr4    118705648   G   A,C  A      0.545    
chr4    118705648   G   A,C  C      0.545
chr4    118705648   G   C      C      0.698

According to the documentation, ANNOVAR should be able to handle this format, but it does not seem to. This is the command I used to run ANNOVAR:

module load annovar/2018-04-16
perl $ANNOVAR_HOME/table_annovar.pl /path/to/file.vcf  $ANNOVAR_DATA/hg38 \
-buildver hg38 -out /path/to/output.vcf \
-protocol exac03nontcga,gnomad_genome,gnomad_exome,esp6500siv2_all,dbnsfp33a,revel,clinvar_20170130,intervar_20180118 -operation f,f,f,f,f,f,f,f \
-nastring . -vcfinput
SNPs ANNOVAR snpEff SNVs • 997 views
3.1 years ago
desouzareis.r ▴ 280

Hello,

You should decompose and normalize your VCF file before annotation. You can use bcftools or vt. You can find more information here.

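# Step 1: split multi-allelic sites into one record per ALT allele
bcftools norm -m-both -o ex1.step1.vcf ex1.vcf.gz
# Step 2: left-align and normalize indels (use the reference FASTA that matches your VCF's build)
bcftools norm -f human_g1k_v37.fasta -o ex1.step2.vcf ex1.step1.vcf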
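
To see why decomposition helps, here is a minimal sketch of what splitting a multi-allelic record looks like, using the site from the question. The function name split_multiallelic is hypothetical and only for illustration. It handles the ALT column alone and does not adjust per-allele INFO or FORMAT fields, so for real data you would still want bcftools norm or vt.

```shell
# Illustration only: emit one line per ALT allele of a tab-separated VCF record.
# Per-allele INFO/FORMAT values are NOT adjusted here; bcftools norm does that.
split_multiallelic() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^#/ { print; next }            # pass header lines through unchanged
       {
         n = split($5, alts, ",")      # column 5 is ALT in a VCF
         for (i = 1; i <= n; i++) {
           $5 = alts[i]                # keep a single allele on this line
           print
         }
       }'
}

# The multi-allelic site from the question (CHROM, POS, ID, REF, ALT)
printf 'chr4\t118705648\t.\tG\tA,C\n' | split_multiallelic
```

After this kind of split, each line carries a single ALT allele, so an annotator that only reads the first allele of a comma-delimited list should still see both A and C.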