snpEff database build error
7 weeks ago

Hi everyone, I was trying to build the snpEff database for Human herpesvirus 5 strain Merlin using the script provided by SnpEff, and I got the error shown in the Error message section below. I suspect the GenBank file itself is the cause, perhaps a formatting problem in the .gbk file. Has anyone encountered a similar problem? How did you overcome it? What do you suggest?

Note: Later, I tried to build the database manually and got the same error. I then updated SnpEff to version 5.1 and tried again, with the same result.

I really appreciate any help you can provide.

To Reproduce

SnpEff version: 5.0

Genome version: AY446894.2

SnpEff full command line: bash ~/path-to-script/ AY446894.2

Output / Error message:
java.lang.RuntimeException: Error reading file '/path-to-data/data/AY446894.2/genes.gbk'
java.lang.RuntimeException: Transcript 'HHV5wtgr002' is already in Gene 'HHV5wtgr002'

Expected behavior: Building database

Annotation Database GenBank SnpEff • 317 views

It seems the annotation contains two genes (probably identical?) at different positions (6759..8458 and 8250..8393) that share the same name (RL9A) and locus_tag (HHV5wtgr002). My guess is that snpEff requires unique names for genes and transcripts.
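To check a GenBank file for this kind of clash before building, something like the following might help. It is only a rough sketch that scans the feature table as plain text, assuming the standard layout (feature keys indented by five spaces, qualifiers on the lines that follow). The function name and the toy excerpt are made up for illustration; only the two RL9A positions come from the file discussed here, and the third gene is hypothetical.

```python
import re
from collections import defaultdict

# A feature line: five spaces, the feature key, then the location.
FEATURE_RE = re.compile(r"^ {5}(\S+)\s+(\S+)")
# A locus_tag qualifier line inside a feature.
LOCUS_TAG_RE = re.compile(r'^\s+/locus_tag="([^"]+)"')


def find_duplicate_locus_tags(lines, feature_key="gene"):
    """Return {locus_tag: [locations]} for tags used by more than one feature.

    Only the first line of a wrapped location is recorded.
    """
    locations = defaultdict(list)
    current_key = current_location = None
    in_features = False
    for line in lines:
        # Any line starting in column 1 opens or closes the feature table.
        if line and not line[0].isspace():
            in_features = line.startswith("FEATURES")
            continue
        if not in_features:
            continue
        match = FEATURE_RE.match(line)
        if match:
            current_key, current_location = match.groups()
            continue
        match = LOCUS_TAG_RE.match(line)
        if match and current_key == feature_key:
            locations[match.group(1)].append(current_location)
    return {tag: locs for tag, locs in locations.items() if len(locs) > 1}


# Toy excerpt (hypothetical except for the two RL9A positions).
SAMPLE = """\
FEATURES             Location/Qualifiers
     gene            6759..8458
                     /gene="RL9A"
                     /locus_tag="HHV5wtgr002"
     gene            8250..8393
                     /gene="RL9A"
                     /locus_tag="HHV5wtgr002"
     gene            9000..9500
                     /gene="toyB"
                     /locus_tag="TOY_0003"
ORIGIN
"""

print(find_duplicate_locus_tags(SAMPLE.splitlines()))
# {'HHV5wtgr002': ['6759..8458', '8250..8393']}
```

For a real file, pass `open("genes.gbk").read().splitlines()` instead of the toy excerpt, and try `feature_key="CDS"` or `"mRNA"` as well, since the error message mentions a transcript.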


Thank you for your input. I believe you guessed correctly. I deleted the redundant entries from the GenBank file. I am not sure that was the right approach, but it worked. Also, I was not interested in those regions anyway.
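For anyone who would rather script the deletion than edit the file by hand, here is a minimal sketch of the idea. It is not what I actually ran, and it rests on some assumptions: the feature table follows the standard GenBank layout, and a later feature is "redundant" when it repeats the same feature key and locus_tag as an earlier one in the same record. That rule would also drop legitimate repeats (for example several CDS features that share one locus_tag), so check the output before building. The function name and the toy excerpt are hypothetical.

```python
import re

FEATURE_RE = re.compile(r"^ {5}(\S+)\s+\S+")
LOCUS_TAG_RE = re.compile(r'^\s+/locus_tag="([^"]+)"')


def drop_repeated_features(lines):
    """Return the lines with later features that reuse a (key, locus_tag) removed."""
    kept = []
    seen = set()   # (feature key, locus_tag) pairs already written
    block = []     # lines of the feature currently being collected
    in_features = False

    def flush():
        # Decide whether the collected feature is kept or dropped.
        if not block:
            return
        key = FEATURE_RE.match(block[0]).group(1)
        tag = next(
            (m.group(1) for m in map(LOCUS_TAG_RE.match, block[1:]) if m), None
        )
        if tag is None or (key, tag) not in seen:
            kept.extend(block)
        if tag is not None:
            seen.add((key, tag))
        block.clear()

    for line in lines:
        if line and not line[0].isspace():
            # A column-1 line (LOCUS, FEATURES, ORIGIN, //) ends any open feature.
            flush()
            if line.startswith("LOCUS"):
                seen.clear()  # start fresh for each record
            in_features = line.startswith("FEATURES")
            kept.append(line)
        elif in_features and FEATURE_RE.match(line):
            flush()
            block.append(line)
        elif block:
            block.append(line)  # qualifier or wrapped-location line
        else:
            kept.append(line)
    flush()
    return kept


# Toy excerpt (hypothetical except for the two RL9A positions).
SAMPLE = """\
FEATURES             Location/Qualifiers
     gene            6759..8458
                     /gene="RL9A"
                     /locus_tag="HHV5wtgr002"
     gene            8250..8393
                     /gene="RL9A"
                     /locus_tag="HHV5wtgr002"
     gene            9000..9500
                     /gene="toyB"
                     /locus_tag="TOY_0003"
ORIGIN
"""

cleaned = drop_repeated_features(SAMPLE.splitlines())
print("\n".join(cleaned))
```

Writing `"\n".join(cleaned)` to a new file keeps the original untouched, which makes it easy to diff the two and confirm that only the intended entries were dropped.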

