Kraken abundances do not add up
7 months ago
jsgounot ▴ 140

When I sum the abundance values of all taxa at one rank (for example S), the total does not match the root abundance. Why is that? In some cases the difference is huge, and it shows up even at high-level ranks such as division. I feel like I'm missing something obvious.

Here is the command I use to check this:

awk '$4 == "D1" { sum += $1 } END { print sum }' report.txt


Thanks!

kraken2 • 210 views
7 months ago
jsgounot ▴ 140

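OK, I found the answer. The missing piece is that k-mers found in more than one child of a node in the taxonomic tree are assigned to that node, the last common ancestor (LCA) of those children. Reads made of such k-mers can therefore be classified at the parent itself rather than at any of its children, so the parent can hold more reads than the sum of its children, which explains the abundance difference. In the report, column 2 counts the reads in the whole clade and column 3 counts the reads assigned directly to that taxon, so the gap I was seeing is the reads sitting in column 3 of the higher ranks. This matters most when, like me, you build a custom database of closely related sequences.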
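
To make this concrete, here is a minimal Python sketch on a made-up report. The taxa and numbers are invented for illustration, and the helper names (`parse_report`, `clade_reads_at_rank`) are my own, not part of Kraken2. It assumes the standard six-column, tab-separated report: percent of reads in the clade, reads in the clade, reads assigned directly to the taxon, rank code, taxid, name.

```python
# Toy Kraken2-style report; every number here is invented for illustration.
TOY_REPORT = "\n".join([
    " 10.00\t10\t10\tU\t0\tunclassified",
    " 90.00\t90\t5\tR\t1\troot",
    " 85.00\t85\t15\tD\t2\t  Bacteria",
    " 40.00\t40\t40\tS\t101\t    Species A",
    " 30.00\t30\t30\tS\t102\t    Species B",
])


def parse_report(text):
    """Parse a six-column report into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        pct, clade, direct, rank, taxid, name = line.split("\t", 5)
        rows.append({
            "pct": float(pct),      # percent of reads in this clade
            "clade": int(clade),    # reads in this taxon and all its descendants
            "direct": int(direct),  # reads assigned to this taxon itself
            "rank": rank,
            "taxid": taxid,
            "name": name.strip(),
        })
    return rows


def clade_reads_at_rank(rows, rank):
    """Sum the clade-level read counts of every taxon with the given rank code."""
    return sum(r["clade"] for r in rows if r["rank"] == rank)


rows = parse_report(TOY_REPORT)
root = next(r for r in rows if r["rank"] == "R")

# Summing one rank misses reads that stopped at a higher node (an LCA).
species_sum = clade_reads_at_rank(rows, "S")

# Summing the directly assigned reads of all classified taxa recovers the root.
classified_direct = sum(r["direct"] for r in rows if r["rank"] != "U")

print("root clade reads:      ", root["clade"])
print("sum over rank S:       ", species_sum)
print("missing at rank S:     ", root["clade"] - species_sum)
print("sum of direct (col 3): ", classified_direct)
```

In this toy case the species rank accounts for 70 of the 90 classified reads. The other 20 were assigned directly to Bacteria (15) and to root (5), which is exactly the kind of gap described above. Summing column 3 over all classified taxa gives the root count again.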