Many unclassified reads using Kraken2, is it normal?
Hi all, I ran a Kraken2 analysis on my dataset (8 samples from 2 different human gut microbiome projects) and more than 50% of the reads got no hits. I tried running those unclassified reads through MetaPhlAn2 and didn't get any classification either. Can you suggest a reason for so many unclassified reads, and can I avoid that? Thanks in advance, Michal

How long are the reads?

We tried 250 bp and 150 bp reads.

I am not familiar with Kraken2 or even MetaPhlAn, but there can be many reasons. It also depends on your pre-processing steps. I think the common reasons are:

• The organism the DNA comes from has just not been sequenced yet
• Your database is incomplete (most common)
• The input data just contains a lot of non-coding DNA or no marker genes

I think if you want an answer from an expert, you need to say whether you did any filtering beforehand and which reference you are using.

First, thanks for your help. I think it's technical, since the samples were prepared in different countries, so it is less likely that they share the same contamination. I also tried to BLAST some sequences and got low alignment scores, so I guess it's not the db, but I will try running some more sequences to be extra sure that is not the problem. I wondered whether it is common to have such a low detection rate... Thanks again, Michal

You can BLAST and get a perfect hit, but not all parts of the DNA can be used to determine a species.

I didn't filter the data and used the Kraken2 db.

I think it will always help to do at least some quality trimming.
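To illustrate what basic quality trimming means, here is a minimal Python sketch. It is not what any particular tool does, and in practice a dedicated trimmer would normally be used. It assumes plain 4-line FASTQ records and Phred+33 quality encoding. The function names (`trim_3prime`, `trim_fastq`) and the thresholds (`min_q=20`, `min_len=50`) are hypothetical choices for the example, not recommendations from this thread.

```python
from typing import Iterable, Iterator, Tuple

PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ quality encoding


def trim_3prime(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Remove trailing bases whose Phred score is below min_q."""
    end = len(seq)
    # Walk back from the 3' end until a base with acceptable quality is found.
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_q:
        end -= 1
    return seq[:end], qual[:end]


def trim_fastq(
    lines: Iterable[str], min_q: int = 20, min_len: int = 50
) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, seq, qual) for reads still at least min_len long after trimming.

    Assumes strict 4-line FASTQ records (header, sequence, '+', qualities).
    """
    it = iter(lines)
    for header in it:
        seq = next(it).rstrip("\n")
        next(it)  # skip the '+' separator line
        qual = next(it).rstrip("\n")
        seq, qual = trim_3prime(seq, qual, min_q)
        if len(seq) >= min_len:
            yield header.rstrip("\n"), seq, qual


# Tiny in-memory example: 'I' encodes Phred 40, '#' encodes Phred 2.
example = [
    "@r1\n", "ACGTACGT\n", "+\n", "IIIII###\n",
    "@r2\n", "ACGT\n", "+\n", "####\n",
]
kept = list(trim_fastq(example, min_q=20, min_len=4))
```

In this toy input, `r1` loses its three low-quality trailing bases and `r2` is dropped entirely because nothing is left after trimming. Whether trimming changes the unclassified fraction for a given dataset would have to be checked by re-running the classifier on the trimmed reads.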

