Custom database in Kraken2: how are plasmids in the same FASTA file as the bacterial genome handled?
15 months ago
Raphaela ▴ 10

Hi everyone,

I am creating a custom database to analyse paired-end reads from stool samples. My project only looks at the abundances of bacterial species of the Lactobacillaceae family, so I downloaded all the corresponding .fna files from NCBI.

When inspecting the .fna files, I found that many of them contain several sequences: often one complete genome of the bacterial strain plus one or two plasmid sequences from the same strain. For my analysis I am only interested in the complete genomes; however, if I filter out everything matching "plasmid", I seem to lose many .fna files that also contain valuable bacterial genomes.

Does anybody have experience in how Kraken2 handles these plasmid genomes? Are they processed individually or is the end result one abundance score for the respective bacterial strain?

Thank you so much for your help!

BW, Rapha

Kraken2 plasmid CustomDB • 554 views
15 months ago

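Kraken2 matches each sequence ID in the FASTA files to a taxonomy ID. A lookup table connects the two; it is built from the NCBI accession-to-taxid files or from explicit kraken:taxid tags in the sequence headers.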
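If it helps to see the sequence-ID-to-taxon link made explicit, here is a minimal sketch that adds a `kraken:taxid` tag to every header in a FASTA file, so the chromosome and plasmid records end up pointing at the same taxon. The function name `tag_fasta_headers`, the sequence IDs and the taxid 99999 are all made up for illustration; substitute the real taxid of your strain or species.

```python
def tag_fasta_headers(fasta_text: str, taxid: int) -> str:
    """Append '|kraken:taxid|<taxid>' to the sequence ID of every FASTA record."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Split the header into the sequence ID and the free-text description.
            parts = line[1:].split(None, 1) or [""]
            seq_id = parts[0]
            desc = parts[1] if len(parts) > 1 else ""
            header = f">{seq_id}|kraken:taxid|{taxid}"
            if desc:
                header += " " + desc
            out.append(header)
        else:
            # Sequence lines are passed through unchanged.
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical two-record file: one chromosome and one plasmid of the same strain.
example = """>SEQ_CHR1 Hypothetical strain X chromosome, complete genome
ACGTACGT
>SEQ_PLASMID1 Hypothetical strain X plasmid pX1, complete sequence
TTGACA
"""

tagged = tag_fasta_headers(example, taxid=99999)  # 99999 is a placeholder taxid
print(tagged)
```

Both records carry the same taxid afterwards, which is why the plasmid does not show up as a separate entry in the results.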

A plasmid sequence is treated no differently from any other sequence of a genome that is split across multiple chromosomes or segments.

What you get at the end is the number of reads assigned to each taxon; how many chromosomes or plasmids the reference contained does not matter.
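A toy illustration of that bookkeeping, assuming made-up sequence IDs, taxids and read hits. This is only a conceptual sketch of why the counts collapse per taxon; it is not how Kraken2 classifies reads internally (that works on k-mers), and `count_reads_per_taxon` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical lookup: the chromosome and plasmid of strain X share one taxid.
seqid_to_taxid = {
    "SEQ_CHR1": 99999,
    "SEQ_PLASMID1": 99999,
    "SEQ_CHR2": 88888,
}

# Hypothetical (read, best-matching reference sequence) pairs.
read_hits = [
    ("read1", "SEQ_CHR1"),
    ("read2", "SEQ_PLASMID1"),
    ("read3", "SEQ_CHR1"),
    ("read4", "SEQ_CHR2"),
]


def count_reads_per_taxon(hits, mapping):
    """Count reads per taxid, regardless of which reference sequence was hit."""
    counts = Counter()
    for _read_id, seq_id in hits:
        counts[mapping[seq_id]] += 1
    return counts


counts = count_reads_per_taxon(read_hits, seqid_to_taxid)
print(counts[99999], counts[88888])  # prints "3 1"
```

The read that matched the plasmid is simply added to the same taxon as the reads that matched the chromosome.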

Amazing, that's good news, thank you so much for your help!
