Question: Annotating a Mycobacterium tuberculosis VCF file using SnpEff and ANNOVAR
Asked 10 months ago by S AR (Pakistan)

Hi,

I generated my VCF files with the GATK pipeline using ploidy 1, since this is a Mycobacterium tuberculosis genome. Now I want to annotate my variants using SnpEff and ANNOVAR. I searched the SnpEff databases for an MTB annotation using:

java -jar snpEff.jar download -v Mycobacterium_tuberculosis

It gave me numerous results, showing that SnpEff contains MTB databases, but I'm not sure which one corresponds to the reference I used to generate the VCF file. My MTB reference genome file looks like this:

>M.tuberculosis_H37Rv NC_000962.3
ttgaccgatgaccccggttcaggcttcaccacagtgtggaacgcggtcgtctccgaacttaacggcgaccctaaggttgacgacggacccagcagtgatgctaatctcagcgctccgctgacccctcagcaaagggcttggctcaatctcgtccagccattgaccatcgtcgaggggtttgctctgttatccgtgccgagcagctttg.............................
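
To make the naming explicit, here is a minimal sketch (the helper name `fasta_ids` is hypothetical) of how the sequence identifiers in a FASTA file could be listed. It assumes the common convention that the identifier is the first whitespace-delimited token after `>`. Under that assumption the header above yields `M.tuberculosis_H37Rv`, while `NC_000962.3` is only part of the description.

```python
import io

def fasta_ids(handle):
    """Yield the identifier of each FASTA record (first token after '>')."""
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            # A header with no text after '>' yields an empty identifier.
            yield fields[0] if fields else ""

# The header shown above, with the sequence shortened for illustration.
example = io.StringIO(">M.tuberculosis_H37Rv NC_000962.3\nttgaccgatgacc\n")
print(list(fasta_ids(example)))  # ['M.tuberculosis_H37Rv']
```

The identifier printed here is the name that would normally need to match the chromosome name in both the VCF file and the annotation database.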

I tried the buildDbNcbi.sh script from SnpEff to build my own database, but it produced the following error:

Downloading genome NC_000962
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17.7M    0 17.7M    0     0   157k      0 --:--:--  0:01:55 --:--:--  483k
00:00:00        SnpEff version SnpEff 4.3t (build 2017-11-24 10:18), by Pablo Cingolani
00:00:00        Command: 'build'
00:00:00        Building database for 'NC_000962'
00:00:00        Reading configuration file 'snpEff.config'. Genome: 'NC_000962'
00:00:00        Reading config file: /home/sark/snpEff/snpEff.config
00:00:01        done
No sequence found in feature file.
        Trying fasta file '/home/sark/snpEff/./data/genomes/NC_000962.fa'
        Trying fasta file '/home/sark/snpEff/./data/NC_000962/sequences.fa'
java.lang.RuntimeException: Cannot find sequence for 'NC_000962'
        at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryFeatures.sequence(SnpEffPredictorFactoryFeatures.java:467)
        at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryFeatures.addFeatures(SnpEffPredictorFactoryFeatures.java:111)
        at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryFeatures.create(SnpEffPredictorFactoryFeatures.java:330)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:369)
        at org.snpeff.SnpEff.run(SnpEff.java:1183)
        at org.snpeff.SnpEff.main(SnpEff.java:162)
java.lang.RuntimeException: Error reading file '/home/sark/snpEff/./data/NC_000962/genes.gbk'
java.lang.RuntimeException: Cannot find sequence for 'NC_000962'
        at org.snpeff.snpEffect.factory.SnpEffPredictorFactoryFeatures.create(SnpEffPredictorFactoryFeatures.java:344)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdBuild.run(SnpEffCmdBuild.java:369)
        at org.snpeff.SnpEff.run(SnpEff.java:1183)
        at org.snpeff.SnpEff.main(SnpEff.java:162)
00:00:01        Logging
00:00:02        Checking for updates...
00:00:04        Done.

Then I placed my FASTA file in the folder mentioned in the error above, but now it gives the following error:

Downloading genome NC_000962.3
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17.7M    0 17.7M    0     0   332k      0 --:--:--  0:00:54 --:--:--  447k
curl: (16) Error in the HTTP2 framing layer

Then I thought of using the built-in database for MTB, so I renamed the chromosome in my file from M.tuberculosis_H37Rv to the name used in the built-in one, ERS007734SCcontig000001. Still no success.
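
For reference, the renaming step can be sketched as follows. This is a minimal illustration, not a definitive tool: the function name `rename_chrom` is hypothetical and the sample record is made up. It rewrites the CHROM column and any matching `##contig` header line. Note that renaming only changes the label; the coordinates are left untouched, so the annotation can only work if the target database describes the same sequence as the reference used for variant calling.

```python
def rename_chrom(lines, old, new):
    """Yield VCF lines with the contig name `old` replaced by `new`."""
    prefix = "##contig=<ID=" + old
    for line in lines:
        if line.startswith((prefix + ",", prefix + ">")):
            # Header line declaring the contig: swap only the ID value.
            yield "##contig=<ID=" + new + line[len(prefix):]
        elif line.startswith("#"):
            # Other header lines are passed through unchanged.
            yield line
        else:
            # Data line: CHROM is the first tab-separated column.
            chrom, sep, rest = line.partition("\t")
            yield (new if chrom == old else chrom) + sep + rest

# Illustrative record only, not taken from the real file.
sample = ["#CHROM\tPOS\tID\tREF\tALT\n", "M.tuberculosis_H37Rv\t9\t.\tG\tA\n"]
for out_line in rename_chrom(sample, "M.tuberculosis_H37Rv", "ERS007734SCcontig000001"):
    print(out_line, end="")
```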

It generates the following error for every variant in the VCF file:

9;ANN=A||MODIFIER|||||||||||||ERROR_OUT_OF_CHROMOSOME_RANGE
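
As far as I understand, ERROR_OUT_OF_CHROMOSOME_RANGE means the variant position lies beyond the end of the chromosome as the database knows it. A minimal sketch of how that could be checked independently of SnpEff is shown below. The function name `out_of_range` and the contig lengths in the example are hypothetical; real lengths would have to come from the reference or the database being used.

```python
def out_of_range(vcf_lines, contig_lengths):
    """Yield (chrom, pos) for records outside the given contig lengths."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        length = contig_lengths.get(chrom)
        # Unknown contigs are reported as well, since they cannot be annotated.
        if length is None or pos < 1 or pos > length:
            yield chrom, pos

# Made-up records and lengths, for illustration only.
records = ["#CHROM\tPOS\tID\tREF\tALT\n", "contigA\t9\t.\tG\tA\n", "contigA\t5000\t.\tC\tT\n"]
print(list(out_of_range(records, {"contigA": 1000})))  # [('contigA', 5000)]
```

If most positions are reported, the VCF coordinates probably refer to a different (longer) sequence than the one in the database.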

Can anyone help me with this, please? And can anyone tell me how to use ANNOVAR for the same VCF file?

Thank you. :)

Tags: annovar, snp, mtb, snpeff, annotation