Question: Kraken reads don't seem to add up
23 months ago, pvishwa2 wrote:

The current study I'm working on requires Kraken classification of paired NGS reads.

To test the data, my supervisor ran a sample on Galaxy and found that >90% of the reads belonged to the category he wanted. I tried to replicate those results on a local server using a local installation of Kraken (with downloaded and built databases). Once I had my report file, I used this code:

cat | grep "$species_name" | awk '{sum+=$1} END {print sum}'

However, the output suggested that only about 20% of the reads mapped to that species. On examining the report file, I found that although 90% of the reads were assigned to the node directly above the "$species_name" I was looking at (its parent node), far fewer mapped to "$species_name" itself. Furthermore, the read percentages of the 90% node's child nodes do not sum to 90%.
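One way to inspect this is to pull out the clade-level and direct read counts for a single taxon instead of summing column 1 over every line that grep matches. The sketch below assumes the report follows the six-column, tab-separated kraken-report layout (percentage of reads in the clade, reads in the clade, reads assigned directly to the taxon, rank code, NCBI taxid, indented scientific name). The function name `taxon_counts` is hypothetical and is not part of Kraken.

```shell
# Sketch only: assumes a tab-separated kraken-report file with columns
#   1 = % of reads in the clade rooted at this taxon
#   2 = number of reads in that clade
#   3 = number of reads assigned directly to this taxon
#   4 = rank code, 5 = NCBI taxid, 6 = scientific name (indented by depth)
# taxon_counts is a hypothetical helper, not a Kraken command.
taxon_counts() {
    report="$1"
    name="$2"
    awk -F'\t' -v name="$name" '
        {
            taxon = $6
            sub(/^ +/, "", taxon)      # drop the indentation that encodes depth
            pct = $1
            gsub(/ /, "", pct)         # drop the padding around the percentage
            if (taxon == name) {       # exact match, so strains are not counted too
                printf "%s\tclade_pct=%s\tclade_reads=%d\tdirect_reads=%d\n", taxon, pct, $2, $3
            }
        }
    ' "$report"
}
```

Calling `taxon_counts foo.report "$species_name"` (where `foo.report` stands in for the report file) prints one line for the exact taxon name. In this layout a node's clade count is its own direct count plus the clade counts of its children, so a parent with a large `direct_reads` value will have children whose percentages sum to less than the parent's.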

I want to know how to find out if there's an issue with the database, my usage of kraken, or my understanding of the data. What is good practice for someone in my situation?


The databases can be built with different parameters and different initial genomes. What databases were used for each run? What command-line options?

You don't provide sufficient details for troubleshooting the differences in results.

written 23 months ago by h.mon

Thanks for the response, I didn't realize I'd forgotten to provide that. For the locally run Kraken test, I used a pre-built version of the Standard Kraken database. My supervisor used the "Bacteria" database on the Kraken web portal. My kraken call was:

kraken --threads 48 -db $DBNAME --fastq-input --gzip-compressed --paired --check-names foo_R1.fastq.gz foo_R2.fastq.gz > foo.kraken

The translate and report commands were called as is (no additional flags specified beyond inputs and database).

modified 23 months ago • written 23 months ago by pvishwa2