Different results in finding human sequences from metagenomes (bowtie2 and Kraken2)
8 weeks ago
kmat • 0

I’d like to remove human reads from human gut metagenomes. Many studies align the reads to the human genome with bowtie2 and retain only the unmapped reads. I did so, but many reads that did not map to the human genome were annotated as human by Kraken2 in the following step.

I then ran both bowtie2 against hg38 and Kraken2 against the standard Kraken2 database, using public data. The results are as follows. Although human sequences made up only a small fraction of these samples, many more reads were considered human by Kraken2 than by bowtie2.

sample  total_read  bowtie2_hg38  kraken_human  both  only_bowtie2  only_kraken
no1     17549939    300           4034          240   60            3794
no2     17053678    112           3067          85    27            2982
no3     16735960    365           5121          343   22            4778
no4     19546779    123           5109          114   9             4995
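
The last three columns follow from the first three counts by simple subtraction, so the table can be sanity-checked and the size of the disagreement put in proportion. A minimal Python sketch, assuming the counts above are copied in by hand (the names `rows` and `summarize` are only illustrative):

```python
# Counts copied from the table above:
# sample -> (total_read, bowtie2_hg38, kraken_human, both)
rows = {
    "no1": (17549939, 300, 4034, 240),
    "no2": (17053678, 112, 3067, 85),
    "no3": (16735960, 365, 5121, 343),
    "no4": (19546779, 123, 5109, 114),
}


def summarize(total, bowtie2, kraken, both):
    """Derive the exclusive counts and the share of reads flagged by either tool."""
    only_bowtie2 = bowtie2 - both
    only_kraken = kraken - both
    either = only_bowtie2 + only_kraken + both
    return {
        "only_bowtie2": only_bowtie2,
        "only_kraken": only_kraken,
        "either": either,
        # Percentage of all reads flagged as human by at least one tool
        "either_pct": 100.0 * either / total,
        # Percentage of bowtie2 hits that Kraken2 also calls human
        "bowtie2_confirmed_pct": 100.0 * both / bowtie2,
    }


for sample, counts in rows.items():
    s = summarize(*counts)
    print(
        sample,
        s["only_bowtie2"],
        s["only_kraken"],
        f"{s['either_pct']:.4f}%",
        f"{s['bowtie2_confirmed_pct']:.1f}%",
    )
```

Most of the bowtie2 hits are also Kraken2 hits in every sample, so the disagreement consists almost entirely of extra Kraken2 calls.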

Do you have any idea about the difference between the results of bowtie2 and Kraken2? And do you have any other suggestions on how to remove human sequences from metagenomes?

Thanks.

metagenomics metagenome • 658 views

Thank you for your reply. I'll try it.


Welcome to metagenomics :-)

Seriously though, try playing with bowtie2 parameters to make it more permissive.
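
One way to make bowtie2 more permissive is to switch from the default end-to-end mode to a sensitive local-alignment preset. The sketch below only builds the command line and does not run it. The index prefix, file names and helper name are hypothetical placeholders, and the choice of `--very-sensitive-local` is an assumption about what "more permissive" could mean here, not a recommendation from this thread:

```python
import shlex


def bowtie2_host_filter_cmd(index_prefix, r1, r2, out_prefix, threads=4):
    """Build (but do not run) a bowtie2 command for more permissive host filtering."""
    return [
        "bowtie2",
        "--very-sensitive-local",  # local-mode preset: read ends may be soft-clipped
        "-p", str(threads),        # number of alignment threads
        "-x", index_prefix,        # prefix of the bowtie2 index (e.g. built from hg38)
        "-1", r1,                  # forward reads
        "-2", r2,                  # reverse reads
        # Pairs that did NOT align concordantly; '%' becomes 1 or 2 per mate
        "--un-conc-gz", f"{out_prefix}_R%.fastq.gz",
        "-S", "/dev/null",         # discard the SAM output, only the unaligned reads are kept
    ]


# Hypothetical inputs, for illustration only
cmd = bowtie2_host_filter_cmd("hg38", "sample_1.fastq.gz", "sample_2.fastq.gz", "sample_nohuman")
print(shlex.join(cmd))
```

Note that `--un-conc-gz` keeps every pair that fails to align concordantly, so a pair in which only one mate looks human is still written to the "clean" output.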


Thank you for your reply :-) I wondered whether the bowtie2 results were too strict or the Kraken2 results too permissive... If anyone knows of any articles comparing host removal methods, please let me know.

Thanks.


I moved this to a comment as it is not an answer.


If I recall correctly, bowtie2 relies on true alignment, whereas Kraken2 uses alignment-free exact k-mer matching (sometimes loosely called pseudoalignment). This, in addition to the many options available in bowtie2, can make a difference.
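
The contrast can be illustrated with a toy example. This is not how either tool is implemented: the reference string, the k-mer length, the mismatch threshold and the ungapped scan are all made-up simplifications, meant only to show why a read that is partly human-like can be flagged by k-mer matching yet fail a whole-read alignment criterion.

```python
K = 8               # toy k-mer length; real classifiers use much longer k-mers
MAX_MISMATCHES = 2  # toy acceptance threshold for a whole-read match

# Made-up "host" sequence, for illustration only
REF = "ATGGCGTACGTTAGCCTAGGATCCGATTACAGGCTTAACG"


def kmers(seq, k=K):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


REF_KMERS = kmers(REF)


def kmer_hit(read):
    """True if the read shares at least one exact k-mer with the reference."""
    return any(read[i:i + K] in REF_KMERS for i in range(len(read) - K + 1))


def end_to_end_hit(read, max_mismatches=MAX_MISMATCHES):
    """True if the whole read fits somewhere on the reference (ungapped) with few mismatches."""
    n = len(read)
    best = min(
        sum(a != b for a, b in zip(read, REF[start:start + n]))
        for start in range(len(REF) - n + 1)
    )
    return best <= max_mismatches


full_read = REF[5:25]                    # entirely host-derived
partial_read = REF[5:15] + "T" * 10      # first half host-derived, second half unrelated

for name, read in [("full", full_read), ("partial", partial_read)]:
    print(name, "k-mer hit:", kmer_hit(read), "end-to-end hit:", end_to_end_hit(read))
```

In this toy setting the half-human read is caught only by the k-mer test. With the real tools, the balance can be shifted from either side: bowtie2's local mode tolerates partial matches, and Kraken2's `--confidence` option requires a larger fraction of k-mers to support a call.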

8 weeks ago

I would also stick to a well-tuned, standard approach rather than trying to roll your own and running into the problems you have noticed. The best known one, in my view, is kneaddata:

https://github.com/biobakery/kneaddata
