obi import fails for genbank files
4.1 years ago

Hi,

I am trying to build a custom reference database for two marker genes (COI & 16S) from the full NCBI nt database downloaded from here: ftp://ftp.ncbi.nlm.nih.gov/blast/db/

I downloaded all nt files (nt.00 to nt.22), unzipped them, and am now trying to follow this tutorial: https://git.metabarcoding.org/obitools/obitools3/wikis/Wolf-tutorial-with-the-OBITools3 to build a reference database with ecopcr.

However, I fail at the first step when trying to import the files:

obi import --genbank-input  nt/nt.00.tar Fabian_Work/refdb

fails with

2020-03-31 18:12:25,809 [import : INFO ]  obi import: imports an object (file(s), obiview, taxonomy...) into a DMS
2020-03-31 18:12:26,070 [import : INFO ]  Opened file: nt/nt.00.tar
2020-03-31 18:14:21,529 [import : INFO ]  Importing 269 entries

Could not import sequence id: b'Alias' (error raised: 'NoneType' object has no attribute 'group' )
Traceback (most recent call last):
  File "python/obitools3/parsers/genbank.pyx", line 40, in obitools3.parsers.genbank.genbankParser
AttributeError: 'NoneType' object has no attribute 'group'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Applications/OBITools/obitools3/obi3-env/bin/obi", line 62, in <module>
    config[root_config_name]['module'].run(config)
File "python/obitools3/commands/import.pyx", line 253, in obitools3.commands.import.run
File "python/obitools3/parsers/genbank.pyx", line 151, in genbankIterator_file
File "python/obitools3/parsers/genbank.pyx", line 56, in obitools3.parsers.genbank.genbankParser
IndexError: list index out of range

Also, the documentation says that

For EMBL files, you can give the path to a directory with several EMBL files.

Is that true for GenBank files, too? If not, can I somehow import them into the same DMS?

Any help is greatly appreciated.

Fabian


Are you sure this tool is designed to use the entire nt database? Can it read compressed tar files (which is what you are trying to use)?
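One way to check the input before calling obi import is to look at the file's content rather than its name. The files under ftp://ftp.ncbi.nlm.nih.gov/blast/db/ are preformatted BLAST databases packed as tar archives, whereas a GenBank flat file is plain text in which every record opens with a LOCUS line. The 'NoneType' object has no attribute 'group' error above would be consistent with the parser not finding the text it expects, although that is an inference from the traceback, not something confirmed against the OBITools3 source. The sketch below is a rough check under those assumptions; `sniff_input` and the 4 KB window are made up for illustration and are not part of OBITools3.

```python
import tarfile


def sniff_input(path, window=4096):
    """Guess what kind of file `path` is from its content alone.

    Returns "tar", "gzip", "genbank" or "unknown". This is a heuristic,
    not a validator: it only inspects the beginning of the file.
    """
    # tar archives (plain or compressed) are recognised by the standard library
    if tarfile.is_tarfile(path):
        return "tar"

    with open(path, "rb") as handle:
        head = handle.read(window)

    # gzip streams start with the two magic bytes 0x1f 0x8b
    if head.startswith(b"\x1f\x8b"):
        return "gzip"

    # GenBank flat-file records open with a line starting with "LOCUS"
    for line in head.splitlines():
        if line.startswith(b"LOCUS"):
            return "genbank"

    return "unknown"
```

If this returns "tar" for nt/nt.00.tar, the file would need to be unpacked and its contents inspected first; the pieces inside a BLAST database archive are binary index and sequence files, not GenBank records.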


Maybe not? In OBITools there was an obiconvert function that could convert the nt files to ecoPCRdb format, but I haven't seen this function implemented in OBITools3. And I cannot get OBITools installed on macOS Catalina (it always fails complaining about the wrong Python, regardless of which Python I use or whether I try to install it via Anaconda).


If OBITools has any 32-bit code in it, it will not work on macOS Catalina. I have a feeling that you are using the wrong input. You could get the "COI" gene sequences from NCBI here and use them as input.
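As a rough illustration of fetching only the marker-gene records instead of the whole nt database, the sketch below builds a search URL for the NCBI E-utilities `esearch` endpoint. It only constructs the URL and does not contact NCBI. The helper name `build_esearch_url`, the default `retmax`, and the example search term are assumptions for illustration; the right term for COI or 16S (gene synonyms, taxonomic restriction) would have to be worked out for the actual project.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities search endpoint
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="nuccore", retmax=20):
    """Return an esearch URL for `term` in the given Entrez database.

    The response (JSON here) lists matching record IDs, which could then
    be downloaded in GenBank format in a separate step.
    """
    params = {
        "db": db,            # Entrez nucleotide database
        "term": term,        # Entrez query string
        "retmax": retmax,    # maximum number of IDs returned
        "retmode": "json",
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Placeholder query: the term is an example, not a vetted COI search
print(build_esearch_url("COI[Gene]"))
```

Records retrieved this way and saved as GenBank flat files would at least be in the format that the --genbank-input option names, unlike the BLAST database archives.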


AFAIK OBITools is a Python package, but the previous version required Python 2 and the new version is built for Python 3. Not sure about 32-bit code. For now, I am trying to build a reference database with the EMBL files instead (as in the tutorial) and will leave the NCBI files unless someone knows how they should be imported. Thanks for your help!
