22 months ago
DNAngel ▴ 240

I want to use Kraken2 to annotate my unmapped reads but I need to use the nt database from NCBI. The nt database comes in so many parts and I've only seen one command on kraken2 which is:

    kraken2-build --standard --db $DBname

where I would replace $DBname with the full FTP path to the database. But the nt database on the FTP server is split into many parts. So do I have to run this command for each of the 40+ pieces of the nt database? Is there a way for Kraken2 to access the whole nt database at once and stitch it together? I am having no luck figuring this part out. Thank you for your help.


try ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/


I don't think you will find a Kraken2 nt database available for download. You may try looking on Zenodo, where lots of custom databases are deposited.

The available official databases are found at Kraken2 Index zone, and some useful (but ageing) databases can be found at Loman Lab Mock Community Experiments Databases.


You can use the GTDB-based indices, which are cleaner than nt: https://github.com/hcdenbakker/GTDB_Kraken


> OP: I would replace $DBname with the full ftp path to the databases

kraken2-build builds the index automatically, and $DBname is the name you will use for the downstream analysis (copied from the manual: Replace "$DBNAME" above with your preferred database name/location). An example would be kraken2-build --standard --db standard_kraken_index_folder. You can also customize which libraries are downloaded for indexing, and kraken2-build allows partial indexing as well. See https://github.com/DerrickWood/kraken2/blob/master/scripts/kraken2-build and http://manpages.ubuntu.com/manpages/eoan/en/man1/kraken2-build.1.html for details. As others on Biostars have said, building an index requires substantial computational resources, so you can download pre-built indices instead.

I am still lost as to how to direct my kraken2 commands to the nt database. I get that $DBname is just a name I choose for my database, but how am I telling kraken2 to build my database using nt as the source?

Should I download the nt database somewhere in my folder and refer to it? Then it takes me back to my initial issue where the nt database is split into too many parts.

22 months ago
GenoMax 122k

The Kraken2 manual shows the following option for building an nt database. Look under the custom databases section (LINK).

    kraken2-build --download-library nt --db $DBNAME

This will require a lot of disk space and RAM, so be aware of that.

Oh, I must have misunderstood. So $DBname is just a name I would create to call my nt database? Or is that where I put the path for the nt database? This is what is confusing me.


See the clarification provided by cpad0112: $DBNAME can be replaced with any name you like. It looks like it can include a location/path in addition to the name.

> so $DBname is just a name I would create to call my nt database?


Correct.

> is that where I put the path for the nt database.


No. You don't need to provide any external path for the database.
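
To make this concrete, here is a minimal sketch of the two download steps from the manual's custom databases section. The directory name nt_db, the DRY_RUN switch, and the run helper are assumptions of this example, not part of Kraken2. By default the script only prints the commands, so you can check them before committing to the (very large) download.

```shell
#!/usr/bin/env bash
# Sketch: the download steps for an nt-based Kraken2 database.
# DBNAME is just a directory name/path that kraken2-build creates and fills.

DBNAME="${DBNAME:-nt_db}"    # hypothetical name; pick anything you like
DRY_RUN="${DRY_RUN:-1}"      # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

nt_download_steps() {
    # Step 1: NCBI taxonomy (names, tree, accession-to-taxon maps).
    run kraken2-build --download-taxonomy --db "$DBNAME"
    # Step 2: the nt library; kraken2-build fetches it itself,
    # so no FTP path is passed anywhere.
    run kraken2-build --download-library nt --db "$DBNAME"
}

nt_download_steps
```

Running it with DRY_RUN=0 would execute the commands for real; everything then lands inside the nt_db directory, which is the same value you later pass to --db.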


Sorry, I had to go MIA for a while due to family. I am back at it, trying out kraken2. I still don't understand where I'm supposed to direct kraken2-build for the nt database. I understand that $DBname is just the name I choose, but how do I direct it to download taxonomy from the nt database?

Ohhh, I think the download_taxonomy.sh script already includes by default a link to the NCBI nt database, is this correct? Derp...

The section on creating custom databases shows the first step, where you download the taxonomy information:

    kraken2-build --download-taxonomy --db $DBNAME


Right, and then if I wanted to make my own custom database with my own sequences, I would just download them manually and add them to my $DBname. I'd use

    kraken2-build --add-to-library my_seqs.fasta --db $DBname

where $DBname would be what I used earlier for the nt database.
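
A rough sketch of that step, under some assumptions. As far as I can tell from the manual, sequences passed to --add-to-library need headers that are either NCBI accession numbers or carry an explicit kraken:taxid|XXX tag, so the hypothetical tag_fasta helper below adds one. The taxid 562, the file names, THREADS, and the DRY_RUN/run wrapper are illustrative choices, not Kraken2 features. After adding sequences, the database is finalised with --build.

```shell
#!/usr/bin/env bash
# Sketch: tag custom sequences with a taxid, add them, then build.

DBNAME="${DBNAME:-nt_db}"    # same directory used for the nt download
THREADS="${THREADS:-8}"      # illustrative thread count
DRY_RUN="${DRY_RUN:-1}"      # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Append "|kraken:taxid|TAXID" to the ID of every FASTA header on stdin.
# Usage: tag_fasta 562 < my_seqs.fasta > my_seqs.tagged.fasta
tag_fasta() {
    awk -v taxid="$1" '
        /^>/ { $1 = $1 "|kraken:taxid|" taxid }   # header lines only
        { print }                                 # sequence lines pass through
    '
}

custom_build_steps() {
    # Add the (tagged) FASTA file to the library of the existing database.
    run kraken2-build --add-to-library "$1" --db "$DBNAME"
    # Build the final database from everything in the library.
    run kraken2-build --build --threads "$THREADS" --db "$DBNAME"
}

custom_build_steps my_seqs.tagged.fasta
```

If all of your sequences already have NCBI accessions in their headers, the tagging step should not be needed and the original FASTA can be added directly.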