How do you download the nt database for Kraken2?
3.2 years ago
DNAngel ▴ 250

I want to use Kraken2 to annotate my unmapped reads, but I need to use the nt database from NCBI. The nt database comes in many parts, and the only Kraken2 command I've found is:

kraken2-build --standard --db $DBname

Here I would replace $DBname with the full FTP path to the database. But the nt database on the FTP server is split into many parts. Do I have to run this command once for each of the 40+ parts? Is there a way for Kraken2 to access the whole nt database at once and stitch it together? I am having no luck figuring this part out. Thank you for your help.

Kraken2 nt • 7.0k views

I don't think you will find a prebuilt Kraken2 nt database for download. You may try looking on Zenodo, where lots of custom databases have been deposited.

The available official databases are found at Kraken2 Index zone, and some useful (but ageing) databases can be found at Loman Lab Mock Community Experiments Databases.


You can use the GTDB-based indices, which are cleaner than nt: https://github.com/hcdenbakker/GTDB_Kraken


OP: I would replace $DBname with the full ftp path to the databases

kraken2-build builds the index automatically, and $DBname is simply the name you will use for the database in downstream analysis (copied from the manual: Replace "$DBNAME" above with your preferred database name/location). An example would be kraken2-build --standard --db standard_kraken_index_folder. You can also customize which libraries are downloaded for indexing, and kraken2-build allows partial indexing as well. Refer to https://github.com/DerrickWood/kraken2/blob/master/scripts/kraken2-build and http://manpages.ubuntu.com/manpages/eoan/en/man1/kraken2-build.1.html

As others on Biostars have said, building the index requires substantial computational resources, so you may prefer to download pre-built indices.


I am still lost as to how to direct my kraken2 commands to the nt database. I get that $DBname now is just a name I choose to call my database, but how am I telling kraken2 to build me my database using nt as the source?

Should I download the nt database somewhere in my folder and refer to it? Then it takes me back to my initial issue where the nt database is split into too many parts.

3.2 years ago
GenoMax 141k

The Kraken2 manual shows the following option for building an nt database. Look under the custom databases section (LINK).

kraken2-build --download-library nt --db $DBNAME

This will require a lot of disk space and RAM. So be aware of that.
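For anyone following along, here is a rough sketch of how the whole nt build might be chained together, based on the custom database steps in the manual. The directory name `nt_db`, the thread count, and the `run` dry-run wrapper are my own assumptions for illustration, not part of Kraken2. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch: build a Kraken2 database from NCBI nt.
# DBNAME is only the local output directory; "nt_db" is an assumed name.
set -euo pipefail

DBNAME="${DBNAME:-nt_db}"
THREADS="${THREADS:-4}"     # assumed value, adjust to your machine
DRY_RUN="${DRY_RUN:-1}"     # 1 = print commands only, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

build_nt_db() {
    # 1. Fetch the NCBI taxonomy into $DBNAME.
    run kraken2-build --download-taxonomy --db "$DBNAME"
    # 2. Fetch the nt library; kraken2-build handles the download itself.
    run kraken2-build --download-library nt --db "$DBNAME"
    # 3. Build the index from what was downloaded.
    run kraken2-build --build --threads "$THREADS" --db "$DBNAME"
}

build_nt_db
```

Once the build finishes, classifying reads would look something like `kraken2 --db nt_db --threads 4 --report report.txt --output kraken.out unmapped.fastq`, where the file names are placeholders.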


Oh, I must have misunderstood. So $DBname is just a name I would create to call my nt database? Or is that where I put the path for the nt database? This is what is confusing me.


See the clarification provided by cpad0112: $DBNAME can be replaced with any name you like. It looks like it can include a location/path in addition to the name.

> so $DBname is just a name I would create to call my nt database?

Correct.

> is that where I put the path for the nt database.

No. You don't need to provide any external path for the database.
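To make that concrete: $DBNAME is just a local directory that kraken2-build fills in as it downloads. A small sketch for checking what has landed there so far is below. The `describe_db` function is a hypothetical helper, and the `taxonomy` and `library` subdirectory names reflect the layout kraken2-build typically creates, so treat them as an assumption and check your own directory.

```shell
#!/usr/bin/env bash
# Sketch: inspect what kraken2-build has placed inside a database directory.
set -euo pipefail

# Hypothetical helper: report whether the download steps have populated
# the directory. The subdirectory names are assumed from the usual layout.
describe_db() {
    local db="$1"
    local part
    for part in taxonomy library; do
        if [ -d "$db/$part" ]; then
            echo "$part: present"
        else
            echo "$part: missing"
        fi
    done
}

# "nt_db" is an assumed default name for the database directory.
describe_db "${DBNAME:-nt_db}"
```

If both parts are reported as present, the download steps have run and the database directory should be ready for the build step.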


Sorry, I had to go MIA for a while due to family. I am back at it, trying out kraken2. I still don't understand where I'm supposed to direct kraken2-build for the nt database. I understand that $DBname is just the name I choose, but how do I direct it to download taxonomy from the nt database?


Ohhh I think the download_taxonomy.sh script already includes by default a link to the NCBI nt database, is this correct? Derp...


The "Creation of the custom databases" section of the manual shows the first step, which is downloading the taxonomy information:

kraken2-build --download-taxonomy --db $DBNAME

Right, and then if I wanted to make my own custom database with my own sequences, I would just download them manually and add them to my $DBname. I'd use

  kraken2-build --add-to-library my_seqs.fasta --db $DBname

where $DBname would be what I used earlier for nt database
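A sketch of how that might look for several FASTA files at once, followed by the build step that has to run after adding sequences. The folder name `my_seqs`, the database name `nt_db`, and the `run` dry-run wrapper are assumptions for illustration. As before, the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: add custom FASTA files to an existing Kraken2 library, then build.
set -euo pipefail

DBNAME="${DBNAME:-nt_db}"       # same directory used for the nt download
SEQ_DIR="${SEQ_DIR:-my_seqs}"   # hypothetical folder holding *.fasta files
DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands only, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

add_custom_seqs() {
    local fasta
    for fasta in "$SEQ_DIR"/*.fasta; do
        # Skip the literal pattern when the glob matches nothing.
        [ -e "$fasta" ] || continue
        run kraken2-build --add-to-library "$fasta" --db "$DBNAME"
    done
    # The index has to be (re)built after the library changes.
    run kraken2-build --build --db "$DBNAME"
}

add_custom_seqs
```

One caveat from the manual: each added sequence needs either an NCBI accession number in its header or an explicit taxonomy ID in the form `kraken:taxid|XXX`, otherwise Kraken2 cannot place it in the taxonomy.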
