Question: Kraken 2 Strain Level Classification?
psun0 wrote (3 months ago):

Hello,

I have a question about the Kraken 2 classifier; please let me know if I am thinking about this incorrectly. To build a custom Kraken 2 database, we need the following (Here is the reference):

1) Install a taxonomy (NCBI)

2) Install one or more reference libraries (we can also include our own sequences in this step using FASTA files)

3) Build the database using the Kraken 2 build command
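As a rough sketch, the three steps above map onto `kraken2-build` invocations like the ones assembled below. The helper name `kraken2_build_commands`, the database name, and the file name `my_strains.fa` are placeholders I made up for illustration; the function only builds the command lines and does not run anything.

```python
import shlex
from typing import List, Sequence


def kraken2_build_commands(
    db_name: str,
    libraries: Sequence[str] = ("bacteria",),
    custom_fastas: Sequence[str] = (),
    threads: int = 4,
) -> List[List[str]]:
    """Return the kraken2-build command lines for the three steps, in order."""
    commands = [
        # Step 1: install the NCBI taxonomy.
        ["kraken2-build", "--download-taxonomy", "--db", db_name],
    ]
    # Step 2a: install one or more standard reference libraries.
    for library in libraries:
        commands.append(
            ["kraken2-build", "--download-library", library, "--db", db_name]
        )
    # Step 2b: add our own FASTA files to the library.
    for fasta in custom_fastas:
        commands.append(
            ["kraken2-build", "--add-to-library", fasta, "--db", db_name]
        )
    # Step 3: build the database from everything installed above.
    commands.append(
        ["kraken2-build", "--build", "--threads", str(threads), "--db", db_name]
    )
    return commands


if __name__ == "__main__":
    # Hypothetical database and FASTA names, for illustration only.
    for cmd in kraken2_build_commands("my_strain_db", custom_fastas=["my_strains.fa"]):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them by eye before running each one in a shell.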

To add other genomes in step 2, the documentation says the sequences must be in FASTA or multi-FASTA files, and each sequence ID in the file(s) must contain either an NCBI accession number or an explicit assignment to a taxid. Suppose I have my own database with a column of strain sequences, a column of strain names, and a column with the matching NCBI accession numbers. Would I be able to add these sequences in step 2 by making my own FASTA file from this information?
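Here is a minimal sketch of turning such a table into a FASTA file. It assumes each row is a tuple of (accession, strain name, sequence, optional taxid); when a taxid is given, the header uses the `kraken:taxid` form described in the Kraken 2 manual, otherwise the accession alone serves as the sequence ID. The function names and the example rows are made up for illustration, and the accessions shown are not real.

```python
import io
from typing import Iterable, Optional, TextIO, Tuple

# (accession, strain_name, sequence, taxid or None)
Record = Tuple[str, str, str, Optional[int]]


def fasta_header(accession: str, strain_name: str, taxid: Optional[int] = None) -> str:
    """Build a header whose sequence ID carries the accession, plus a taxid if given."""
    seq_id = accession if taxid is None else f"{accession}|kraken:taxid|{taxid}"
    return f">{seq_id} {strain_name}"


def write_kraken2_fasta(records: Iterable[Record], handle: TextIO, width: int = 60) -> int:
    """Write records as multi-FASTA; return the number of sequences written."""
    written = 0
    for accession, strain_name, sequence, taxid in records:
        # Strip whitespace and normalise case before writing.
        sequence = "".join(sequence.split()).upper()
        if not sequence:
            continue  # skip rows with no sequence
        handle.write(fasta_header(accession, strain_name, taxid) + "\n")
        # Wrap the sequence at a fixed line width.
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")
        written += 1
    return written


if __name__ == "__main__":
    # Placeholder rows: fake accessions, toy sequences, illustrative taxid.
    rows = [
        ("ACC000001.1", "Example strain A", "acgt acgt", None),
        ("ACC000002.1", "Example strain B", "GGCC", 562),
    ]
    buffer = io.StringIO()
    write_kraken2_fasta(rows, buffer)
    print(buffer.getvalue(), end="")
```

The resulting file would then be the one passed to the library-adding step of the database build.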

Would it be possible to get Kraken 2 to classify reads that match these strains from our own custom database? (Kraken 2 documentation says that it does not classify reads at the strain level)

I suppose I'm more confused about why some tools only allow classification to the species level when we can make our own database that provides sequences at the strain level (unless the classifier is not able to look up the strain information from NCBI to classify the reads properly). Please let me know if there is any gap in my understanding.

Thank you.

UPDATE: Kraken 2 can classify at the strain level if you use your own custom database, as long as the k-mers are unique enough to distinguish the strains.
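To get a feel for what "unique enough" means, here is a toy sketch that counts the k-mers found in only one strain. It is a simplification and not Kraken 2's actual algorithm: it uses exact k-mers on one strand with a tiny k, whereas Kraken 2 works with minimizers of much longer k-mers. The function names and the example sequences are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Set


def kmers(sequence: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def strain_specific_kmers(strains: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each strain name to the k-mers that occur in no other strain."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for name, sequence in strains.items():
        for kmer in kmers(sequence.upper(), k):
            owners[kmer].add(name)

    unique: Dict[str, Set[str]] = {name: set() for name in strains}
    for kmer, names in owners.items():
        if len(names) == 1:
            # Only one strain contains this k-mer, so it can tell that strain apart.
            unique[next(iter(names))].add(kmer)
    return unique


if __name__ == "__main__":
    # Toy sequences that differ only at the 3' end.
    toy = {"strain_A": "ACGTACGTAA", "strain_B": "ACGTACGTCC"}
    for name, specific in strain_specific_kmers(toy, k=4).items():
        print(name, sorted(specific))
```

In this simplified picture, a read made up only of k-mers shared by both strains cannot be told apart at the strain level, while a read covering a strain-specific k-mer can. Strains with few or no specific k-mers would be hard to separate.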

classifier strain kraken 2 • 233 views
modified 12 weeks ago • written 3 months ago by psun0

The answer was found in the Kraken 2 GitHub issues.

For this reason the post is now closed.

Cheers!

modified 12 weeks ago • written 12 weeks ago by psun0

Closing a post is an action used by moderators for other reasons. If you found an answer to your question, please post the complete GitHub link as an answer so that anyone finding this post in the future can get the answer.

written 12 weeks ago by genomax