snpEff: ERROR_CHROMOSOME_NOT_FOUND
4 months ago
ziziqolo ▴ 10

Hey BioStars

I'm trying to annotate my bacterial variants with SnpEff:

java -jar snpEff.jar -v Mycoplasma_hyopneumoniae_168_l /mnt/f/mycopn/variantcalling.vcf > /mnt/f/mycopn/res.ann.vcf

Both my reference and my snpEff database are Mycoplasma_hyopneumoniae_168_l, but I ran into trouble.

The annotated VCF file has an empty ID column and reports ERROR_CHROMOSOME_NOT_FOUND 9986. I read about the error, but as a newbie I could not find any hint on how to fix it.

Would you please help me? Best regards.

variant annotation • 524 views

What is the output of

grep -v "^#" /mnt/f/mycopn/variantcalling.vcf | cut -f 1 | sort | uniq

and how does it compare to the chromosome names in the Mycoplasma_hyopneumoniae_168_l snpEff database?
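To make the comparison concrete, here is a minimal sketch of the check. It runs on a made-up two-record VCF rather than the real file, and the database chromosome name (db_chrom) is only a placeholder; the actual name has to be looked up in the snpEff database.

```shell
# Hypothetical two-record VCF standing in for variantcalling.vcf.
vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$vcf"
printf 'NC_021283.1\t100\t.\tA\tG\t50\tPASS\t.\n' >> "$vcf"
printf 'NC_021283.1\t250\t.\tC\tT\t50\tPASS\t.\n' >> "$vcf"

# Contig names used by the variant records (header lines are skipped).
vcf_chroms=$(grep -v '^#' "$vcf" | cut -f 1 | sort -u)

# Assumed chromosome name in the snpEff database; a placeholder only.
db_chrom="CP003131.1"

# Report every VCF contig that the database would not recognise.
mismatches=""
for c in $vcf_chroms; do
  if [ "$c" != "$db_chrom" ]; then
    mismatches="$mismatches$c "
    echo "$c is not in the database (database uses $db_chrom)"
  fi
done
rm -f "$vcf"
```

Any contig reported here is one that snpEff would flag with ERROR_CHROMOSOME_NOT_FOUND.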


The command you wrote returns: NC_021283.1

As for the second part, sorry, did you ask about the aligner? I used hisat2.


How did you build or get the snpEff database for Mycoplasma_hyopneumoniae_168_l?


For example, going by https://www.ncbi.nlm.nih.gov/nuccore/NC_017509.1, the genome could be named NC_017509.1 in snpEff, and not NC_021283.1.


Oh sure, snpEff downloaded the database itself:

Downloading database for 'Mycoplasma_hyopneumoniae_168_l'
Database installed.
4 months ago

snpEff uses the data from Ensembl: http://ftp.ensemblgenomes.org/pub/current/bacteria/species_EnsemblBacteria.txt

>>> 2
$1                #name : Mycoplasma hyopneumoniae 168 (GCA_000183185)
$2              species : mycoplasma_hyopneumoniae_168_gca_000183185
$3             division : EnsemblBacteria
$4          taxonomy_id : 907287
$5             assembly : ASM18318v1
$6   assembly_accession : GCA_000183185.1
$7            genebuild : 2014-05-HuazhongAgriculturalUniversity
$8            variation : N
$9           microarray : N
$10         pan_compara : N
$11     peptide_compara : N
$12   genome_alignments : N
$13    other_alignments : Y
$14             core_db : bacteria_113_collection_core_52_105_1
$15          species_id : 196
$16                 ??? : 
<<< 2

There is a good chance that the chromosome in the snpEff database is not named "NC_021283.1" but CP003131.1 (https://www.ncbi.nlm.nih.gov/assembly/GCA_000400855.1), or something else.
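A record like the one above can be pulled out of the species table with awk. This is only a sketch: the demo line is rebuilt from the fields shown above (tab-separated, in the same $1..$15 order), and the temporary file stands in for a downloaded copy of species_EnsemblBacteria.txt.

```shell
# Demo row rebuilt from the record above; stands in for the real species file.
species_file=$(mktemp)
printf 'Mycoplasma hyopneumoniae 168 (GCA_000183185)\tmycoplasma_hyopneumoniae_168_gca_000183185\tEnsemblBacteria\t907287\tASM18318v1\tGCA_000183185.1\t2014-05-HuazhongAgriculturalUniversity\tN\tN\tN\tN\tN\tY\tbacteria_113_collection_core_52_105_1\t196\n' > "$species_file"

# Print species ($2), assembly ($5) and assembly accession ($6)
# for every entry whose species name starts with the strain of interest.
hit=$(awk -F'\t' -v OFS='\t' \
  '$2 ~ /^mycoplasma_hyopneumoniae_168/ { print $2, $5, $6 }' "$species_file")
echo "$hit"
rm -f "$species_file"
```

The assembly accession is the useful part: looking it up at NCBI shows which sequence names belong to that assembly, and therefore which name the database is likely to use.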


Thank you so much. So now do I have to build my own database using the name NC_021283.1?


No. You have to find out the name of the chromosome in the snpEff database and rename the contig in the VCF to match it, using bcftools annotate --rename-chrs ...
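A rough sketch of the renaming step, assuming the database turns out to use CP003131.1 (that name is only the guess from the answer above, so substitute whatever the database really contains). The map file name and the output path are made up for illustration. --rename-chrs expects a file with one "old_name new_name" pair per line, separated by whitespace.

```shell
# Map file: old contig name, then the name used by the snpEff database.
# CP003131.1 is an assumption; replace it with the real database name.
map_file="rename_chrs.txt"
printf 'NC_021283.1\tCP003131.1\n' > "$map_file"

in_vcf="/mnt/f/mycopn/variantcalling.vcf"
out_vcf="/mnt/f/mycopn/variantcalling.renamed.vcf"   # hypothetical output path

# Rename only when bcftools is installed and the input VCF exists.
if command -v bcftools >/dev/null 2>&1 && [ -f "$in_vcf" ]; then
  bcftools annotate --rename-chrs "$map_file" -o "$out_vcf" "$in_vcf"
fi
```

The renamed VCF can then be passed to snpEff in place of the original file.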
