Question

How does Kraken2 Work?

1

4.5 years ago

bk2070 ▴ 10

I've read the manual, but I'm still confused. Does Kraken2 break the reads you give it into k-mers of length 31-32 and look for matches against its database? If so, does it tolerate a couple of nucleotide mismatches? Also, how does this interact with the confidence scoring parameter? If I increase it, will classification become more stringent?

Last but not least, if I were to assemble my shotgun metagenome dataset into contigs, would it even be worth using Kraken2 if it's still going to break the contigs into k-mers of length 31-32?

Cheers.

kraken2 • 4.5k views

updated 4.5 years ago by Asaf 10k • written 4.5 years ago by bk2070 ▴ 10

score 3 · Answer 1 · 2019-10-16

3

4.5 years ago

Asaf 10k

Yes, it breaks each read into all of its overlapping k-mers. A perfect hit will have every k-mer mapped; a single mismatch affects every k-mer that contains that nucleotide. Note that Kraken 2's default k is 35 (with 31-bp minimizers), not 31-32.
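To make the mismatch point concrete, here is a minimal sketch that counts how many overlapping k-mers of a read contain a given position. It is a simplification: it treats whole k-mers as the lookup unit and ignores the minimizers Kraken 2 actually stores, and the function names and the 100-bp read length are made up for illustration.

```python
def kmers(seq, k):
    """Return every overlapping k-mer of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmers_hit_by_mismatch(read_len, k, pos):
    """Count the k-mers of a read that cover the 0-based position pos."""
    first = max(0, pos - k + 1)      # leftmost k-mer start that still covers pos
    last = min(pos, read_len - k)    # rightmost valid k-mer start covering pos
    return max(0, last - first + 1)


read_len, k = 100, 35                # k = 35 is the default for nucleotide databases
total = read_len - k + 1             # number of k-mers in the read
mid = kmers_hit_by_mismatch(read_len, k, 50)   # mismatch in the middle of the read
edge = kmers_hit_by_mismatch(read_len, k, 0)   # mismatch at the very first base
print(total, mid, edge)
```

Under these assumptions a single mismatch in the middle of a 100-bp read removes about half of its k-mers from consideration, while one at the very end costs only a single k-mer.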
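The confidence score is basically the number of k-mers that hit the assigned taxon or one of its descendants, divided by the total number of k-mers queried. Raising the threshold makes classification more stringent: a read whose score falls below it is moved up the taxonomy or left unclassified.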
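The way the threshold behaves can be sketched roughly as follows, based on the description in the Kraken 2 manual: the label is moved to its parent until the fraction of k-mers hitting the label's clade meets the threshold. The taxonomy, taxon names, and function names here are hypothetical, and this is not Kraken 2's actual code.

```python
# Toy taxonomy (hypothetical names): child -> parent; the root has no parent.
PARENT = {
    "species_A": "genus_X",
    "species_B": "genus_X",
    "genus_X": "root",
    "root": None,
}


def in_clade(taxon, clade_root):
    """True if taxon is clade_root or one of its descendants."""
    while taxon is not None:
        if taxon == clade_root:
            return True
        taxon = PARENT[taxon]
    return False


def apply_confidence(label, kmer_hits, threshold):
    """Climb from label toward the root until the clade score meets threshold.

    kmer_hits holds one entry per queried k-mer: a taxon name, or None for no hit.
    Returns (label, score); label is None if no ancestor reaches the threshold.
    """
    queried = len(kmer_hits)
    while label is not None:
        clade_hits = sum(1 for t in kmer_hits if t is not None and in_clade(t, label))
        score = clade_hits / queried
        if score >= threshold:
            return label, score
        label = PARENT[label]        # not confident enough: move up one rank
    return None, 0.0                 # unclassified


# 10 k-mers: 4 hit species_A, 2 species_B, 1 genus_X, 3 hit nothing.
hits = ["species_A"] * 4 + ["species_B"] * 2 + ["genus_X"] + [None] * 3
for threshold in (0.0, 0.5, 0.8):
    print(threshold, apply_confidence("species_A", hits, threshold))
```

With this toy input the read stays at the species for a threshold of 0.0, is pushed up to the genus at 0.5, and becomes unclassified at 0.8, which is the "more stringent" behaviour asked about.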
You can use Kraken2 for contigs, but other methods will potentially give more precise results.

4.5 years ago by Asaf 10k

0

Asaf, to your third point, what other methods for contigs would you recommend? I use direct mapping of contigs as an alternative, but that doesn't give me LCA classifications when there's no species hit, which is what I like about Kraken.

4.3 years ago by BioinformaticsLad ▴ 200

2

I like GTDB-Tk: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btz848/5626182

4.3 years ago by Asaf 10k