2.4 years ago
ttom
I am trying to annotate a VCF with ANNOVAR using the following command:
perl /annovar/table_annovar.pl chr21.vcf.gz annovar/humandb/ -buildver hg38 -out chr21 -remove -protocol refGene,ensGene,esp6500siv2_aa,esp6500siv2_ea,esp6500siv2_all -operation g,g,r,r,r -nastring . -vcfinput --nopolish
I am not getting the output in VCF format, and I do not understand the error.
Error and log
NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGene -outfile chr21.refGene -exonsort -nofirstcodondel chr21.avinput annovar/humandb/>
NOTICE: Output files are written to chr21.refGene.variant_function, chr21.refGene.exonic_variant_function
NOTICE: Reading gene annotation from annovar/humandb/hg38_refGene.txt ... Done with 88819 transcripts (including 21511 without coding sequence annotation) for 28307 unique genes
NOTICE: Processing next batch with 359444 unique variants in 359444 input lines
NOTICE: Reading FASTA sequences from annovar/humandb/hg38_refGeneMrna.fa ... Done with 647 sequences
WARNING: A total of 606 sequences will be ignored due to lack of correct ORF annotation
-----------------------------------------------------------------
NOTICE: Processing operation=g protocol=ensGene
NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype ensGene -outfile chr21.ensGene -exonsort -nofirstcodondel chr21.avinput annovar/humandb/>
NOTICE: Output files are written to chr21.ensGene.variant_function, chr21.ensGene.exonic_variant_function
NOTICE: Reading gene annotation from annovar/humandb/hg38_ensGene.txt ... Done with 111108 transcripts (including 39529 without coding sequence annotation) for 47298 unique genes
NOTICE: Processing next batch with 359444 unique variants in 359444 input lines
NOTICE: Reading FASTA sequences from annovar/humandb/hg38_ensGeneMrna.fa ... Done with 626 sequences
WARNING: A total of 415 sequences will be ignored due to lack of correct ORF annotation
-----------------------------------------------------------------
NOTICE: Processing operation=r protocol=esp6500siv2_aa
NOTICE: Running with system command <annotate_variation.pl -regionanno -dbtype esp6500siv2_aa -buildver hg38 -outfile chr21 chr21.avinput annovar/humandb/>
NOTICE: Output file is written to chr21.hg38_esp6500siv2_aa
NOTICE: Reading annotation database annovar/humandb/hg38_esp6500siv2_aa.txt ... Error: invalid record found in region annotation database: <1 69428 69428 T G 0.0037 rs140739101>
Error running system command: <annotate_variation.pl -regionanno -dbtype esp6500siv2_aa -buildver hg38 -outfile chr21 chr21.avinput annovar/humandb/>
Error running system command: <annovar/table_annovar.pl chr21.avinput annovar/humandb/ -buildver hg38 -outfile chr21 -remove -protocol refGene,ensGene,esp6500siv2_aa,esp6500siv2_ea,esp6500siv2_all -operation g,g,r,r,r -nastring . --nopolish -otherinfo>
The first error says: Error: invalid record found in region annotation database: <1 69428 69428 T G 0.0037 rs140739101>
Are you using the same chromosome format for your input VCF and the annotation database? For example, make sure that if your VCF says chr1, the annotation database also uses the chr1 format instead of 1.
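One way to check this is to compare the chromosome names in the first column of both files. The sketch below is only an illustration, not part of ANNOVAR: the helper names (open_text, chrom_names, naming_style, same_style) are made up here, and it assumes both files are tab-delimited with the chromosome in the first column and that header lines start with "#".

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def chrom_names(lines, chrom_col=0):
    """Collect distinct chromosome names from tab-delimited lines.

    Lines starting with '#' (VCF headers) and blank lines are skipped.
    Assumes the chromosome sits in column `chrom_col` (0-based).
    """
    names = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        names.add(fields[chrom_col])
    return names


def naming_style(names):
    """Classify a set of chromosome names as 'chr', 'plain', 'mixed', or 'empty'."""
    if not names:
        return "empty"
    prefixed = {name.startswith("chr") for name in names}
    if prefixed == {True}:
        return "chr"
    if prefixed == {False}:
        return "plain"
    return "mixed"


def same_style(vcf_lines, db_lines, db_chrom_col=0):
    """Return True if the VCF and the database use the same chromosome naming style."""
    vcf_style = naming_style(chrom_names(vcf_lines))
    db_style = naming_style(chrom_names(db_lines, db_chrom_col))
    return vcf_style == db_style


# Small in-memory demo. The VCF line is made up for illustration; the database
# line is the record quoted in the error message, assumed to be tab-separated.
example_vcf = ["#CHROM\tPOS\tID\tREF\tALT\n", "chr21\t100\t.\tC\tT\n"]
example_db = ["1\t69428\t69428\tT\tG\t0.0037\trs140739101\n"]
print(naming_style(chrom_names(example_vcf)))  # chr
print(naming_style(chrom_names(example_db)))   # plain
print(same_style(example_vcf, example_db))     # False
```

To run it on the real files, open_text("chr21.vcf.gz") and open_text("annovar/humandb/hg38_esp6500siv2_aa.txt") can be passed in place of the example lists, since both return iterables of lines. If the styles differ, that would be consistent with the mismatch described above.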