Question: Distinguishing sequencing reads as prokaryotic or eukaryotic without a reference genome
5 months ago, darinshrewsberry1994 wrote:

Hi all,

I've got sequencing data from the microbiome of a eukaryote that does not have a reference genome. I performed several pre-sequencing steps to exclude as much eukaryotic DNA as possible; however, I still want to determine whether any made it through to sequencing and assembly. What could I do to at least classify the reads as eukaryotic vs. prokaryotic?



Can you elaborate a little on what you have done already?

Off the top of my head, I don't think there is much you can do.

written 5 months ago by lieven.sterck

I extracted the guts of the organism, then placed them in a digestion cocktail to create a single-cell suspension, which I filtered to help break up any clumps. I stained the sample to prepare it for fluorescent cell sorting, and we size-separated the cells to exclude anything larger than 5 µm. Ideally, this should get rid of the eukaryotic cells and thus most if not all of their DNA; however, there could be free-floating DNA from cells that burst. So we used qPCR to quantify the level of host DNA before and after sorting, and we did see a decrease, so we proceeded with sequencing and assembly. This is the first time we've gone through the entire process as a whole, so once we received the assembly stats, my PI wanted one final check after the metagenome assembly to see whether any eukaryotic reads were still present. The problem is that there isn't a reference genome for the eukaryotic organism we're working with. When we run this again in the future, we will likely add a DNase treatment after cell sorting to degrade any free-floating DNA.

modified 5 months ago • written 5 months ago by darinshrewsberry1994

Just run all the reads you have through something like Centrifuge or Kraken and it will fairly quickly identify what's what at a reasonable resolution.

It may even let you segregate just the ones you want, but I'm not 100% sure.

written 5 months ago by jrj.healey

We did run a Kraken analysis and about 25% of the data was classified, but we're not sure how much of the unclassified fraction is host and how much is bacteria that don't exist in the database. Given that our qPCR results suggested we had little to no host DNA in the sample right before we sent it off for sequencing, we were a little stumped.
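As a rough first look at where the classified and unclassified fractions sit, the Kraken report can be summarized per domain. This is only a sketch: it assumes the standard six-column, tab-separated report (percentage, clade reads, taxon reads, rank code, taxid, name), the function name `summarize_domains` is made up for illustration, and the sample numbers are invented, not real results.

```python
from typing import Dict, Iterable


def summarize_domains(report_lines: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of reads for the unclassified row and each domain row."""
    summary: Dict[str, float] = {}
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        percent, _clade_reads, _taxon_reads, rank, _taxid, name = fields[:6]
        # Rank code "U" marks the unclassified row, "D" marks domain-level rows.
        if rank in ("U", "D"):
            summary[name.strip()] = float(percent)
    return summary


# Invented example lines in the assumed report layout.
sample_report = [
    " 75.00\t7500\t7500\tU\t0\tunclassified",
    " 25.00\t2500\t10\tR\t1\troot",
    " 24.00\t2400\t50\tD\t2\t    Bacteria",
    "  0.90\t90\t5\tD\t2759\t    Eukaryota",
]
domain_summary = summarize_domains(sample_report)
for group, pct in domain_summary.items():
    print(f"{group}: {pct:.2f}%")
```

This does not say what the unclassified reads are; it only shows how the classified part splits between domains, which depends entirely on what is in the database.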

written 5 months ago by darinshrewsberry1994
5 months ago, h.mon wrote:

There are several tools for this task; I personally like BlobTools for assembled draft genomes.


I am plagiarizing myself (Interpreting mapping contaminants):

I like to use BlobTools (blasting against NCBI NT) to explore the taxonomic assignment of an assembly, and detect possible contamination - that is, I check for contaminants post-assembly.
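Blob plots place each contig by GC content and coverage and color it by taxonomic hit, so contigs from a eukaryotic host may show up as a cluster apart from the bacterial ones. The GC part of that signal is easy to compute by hand. The snippet below is a minimal sketch and not BlobTools code; `gc_by_contig` and the toy contigs are made up for illustration.

```python
from typing import Dict, Iterable, List


def gc_by_contig(fasta_lines: Iterable[str]) -> Dict[str, float]:
    """Return the GC fraction of each contig, counting only A, C, G and T."""
    counts: Dict[str, List[int]] = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts[name] = [0, 0]  # [G+C bases, A+C+G+T bases]
        elif name is not None:
            seq = line.upper()
            gc = seq.count("G") + seq.count("C")
            at = seq.count("A") + seq.count("T")
            counts[name][0] += gc
            counts[name][1] += gc + at  # ambiguous bases such as N are ignored
    return {n: (gc / total if total else 0.0) for n, (gc, total) in counts.items()}


# Toy contigs, invented for the example.
toy_fasta = [">c1 len=8", "GGCC", "GCGC", ">c2", "ATAT", "ATGN"]
gc_values = gc_by_contig(toy_fasta)
for contig, gc in gc_values.items():
    print(f"{contig}\t{gc:.3f}")
```

GC content alone will not prove that a contig is eukaryotic, which is why it is usually read together with coverage and BLAST hits.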

You can also use sketches to analyse contamination either on your raw data (pre-assembly) or on assemblies, see:

Mash Screen: what's in my sequencing run?

What’s in my metagenome?

Tool: BBSketch - A Tool for Rapid Sequence Comparison
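For anyone wondering what a "sketch" is here: the idea is to keep only a small set of the lowest k-mer hashes per sequence and compare those sets instead of the full sequences. Below is a toy MinHash illustration of that idea, not the actual Mash or BBSketch implementation. The function names, the k-mer size, the sketch size and the use of SHA-1 are all arbitrary choices for the example; real tools add canonical k-mers, faster hash functions and other refinements.

```python
import hashlib
from typing import Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq: str, k: int = 5, size: int = 50) -> Set[int]:
    """Bottom-`size` sketch: the smallest hash values over all k-mers."""
    hashes = {
        int.from_bytes(hashlib.sha1(km.encode()).digest()[:8], "big")
        for km in kmers(seq, k)
    }
    return set(sorted(hashes)[:size])


def jaccard_estimate(a: Set[int], b: Set[int], size: int = 50) -> float:
    """Estimate Jaccard similarity from the smallest hashes of the merged sketches."""
    merged = sorted(a | b)[:size]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in a and h in b)
    return shared / len(merged)


# Toy sequences, invented for the example.
seq_a = "ACGTACGTTGCAAGCTTAGCTAGGATCCGATCGATTACG"
same = jaccard_estimate(sketch(seq_a), sketch(seq_a))
different = jaccard_estimate(sketch("A" * 20), sketch("C" * 20))
print(same, different)
```

Because only the sketches are compared, screening a read set against a large reference collection stays cheap, which is what makes this usable on raw data before assembly.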

Finally, you can also use k-mer screening tools like Kraken or Centrifuge to screen and filter out contaminants.
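The filtering step can be sketched as follows, assuming Kraken's default per-read output (tab-separated: C/U status, read ID, taxid, length, LCA mapping) and a well-formed four-line FASTQ. The helper names and the toy data are hypothetical, and taxid 9606 (human) is only a stand-in for whatever taxa you decide to treat as contaminants.

```python
from typing import Iterable, Iterator, Set, Tuple


def ids_assigned_to(kraken_lines: Iterable[str], unwanted_taxids: Set[str]) -> Set[str]:
    """Collect IDs of classified reads whose assigned taxid is in `unwanted_taxids`."""
    flagged: Set[str] = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        status, read_id, taxid = fields[0], fields[1], fields[2]
        if status == "C" and taxid in unwanted_taxids:
            flagged.add(read_id)
    return flagged


def filter_fastq(fastq_lines: Iterable[str], drop_ids: Set[str]) -> Iterator[Tuple[str, str, str, str]]:
    """Yield four-line FASTQ records whose read ID is not in `drop_ids`."""
    lines = iter(fastq_lines)
    # zip over the same iterator groups the input into consecutive 4-line records.
    for header, seq, plus, qual in zip(lines, lines, lines, lines):
        read_id = header.strip()[1:].split()[0]
        if read_id not in drop_ids:
            yield header, seq, plus, qual


# Toy data, invented for the example.
toy_kraken = [
    "C\tr1\t9606\t100\t9606:70",
    "C\tr2\t562\t100\t562:70",
    "U\tr3\t0\t100\t0:70",
]
toy_fastq = [
    "@r1", "ACGT", "+", "IIII",
    "@r2 extra", "GGCC", "+", "IIII",
    "@r3", "TTTT", "+", "IIII",
]
drop = ids_assigned_to(toy_kraken, {"9606"})
kept_headers = [record[0] for record in filter_fastq(toy_fastq, drop)]
print(kept_headers)
```

Note that this matches exact taxids only: reads assigned to a descendant of an unwanted taxon are kept unless that descendant's taxid is also listed, and unclassified reads are never dropped.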

modified 5 months ago • written 5 months ago by h.mon

Thanks, I'll give this a look.

written 5 months ago by darinshrewsberry1994