How to separate or bin a mix of eukaryote and prokaryote DNA?
9.2 years ago
jamesT ▴ 30

I have several contigs (~1 kb - 20 kb) from a metagenomic sample which, according to marker gene analysis, belong primarily either to a single eukaryote or to several prokaryotes, both bacterial and archaeal. All of the contigs have very similar GC content.

What is the best way to separate the eukaryote DNA from the prokaryote DNA without using references? (Blasting each contig against a reference won't work, e.g., because most of the contigs are too far removed from any reference.) Codon biases? Looking for certain genomic features?

sequencing metagenomics genome • 3.9k views
9.2 years ago
5heikki 11k

How about an ESOM visualization? Here's a nice resource. Also check e.g. Alignment-free Visualization of Metagenomic Data by Nonlinear Dimension Reduction.


I've been running MaxBin (paper) on our latest metagenomic assemblies and could not be happier with the results. Unlike with, e.g., the ESOM approach, MaxBin does not require you to provide dozens of parameters or to select bins by hand. In essence, MaxBin estimates the number of bins through marker gene analysis, and then scaffolds are binned on the basis of coverage and tetranucleotide frequency. Highly recommended. I've been thinking of combining IDBA-UD (initial assembly), MaxBin (binning) and PRICE (targeted assembly of bins) for a really kick-ass pipeline.
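To make the tetranucleotide frequency part concrete, here is a minimal sketch of how such a composition vector can be computed for a single contig. This is not MaxBin's actual code; the function names (`revcomp`, `canonical_kmers`, `tetra_freq`) are made up for illustration, and merging each 4-mer with its reverse complement is an assumption that is common in binning tools but not taken from the MaxBin paper.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (non-ACGT characters are left as-is)."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(k=4):
    """Sorted k-mers, keeping one representative per reverse-complement pair."""
    kmers = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        kmers.add(min(kmer, revcomp(kmer)))
    return sorted(kmers)


def tetra_freq(seq, k=4):
    """Relative frequency of each canonical k-mer in seq (hypothetical helper)."""
    seq = seq.upper()
    keys = canonical_kmers(k)
    counts = dict.fromkeys(keys, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        canon = min(kmer, revcomp(kmer))
        if canon in counts:  # windows containing N or other symbols are skipped
            counts[canon] += 1
    total = sum(counts.values())
    return [counts[key] / total if total else 0.0 for key in keys]
```

Each contig then becomes a 136-dimensional vector (for k = 4) that can be compared or clustered together with its coverage. Very short contigs give noisy vectors, so a minimum length cutoff is usually a good idea.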

9.2 years ago
thackl ★ 3.0k

Have a look at MGTAXA. It does a fairly good job of assigning taxonomic ranks to contigs. You might, however, need to train a custom model.

Alternatively, you can distinguish sequences by coverage (either read mapping or k-mer based) and additionally by GC content. Blobology uses this kind of approach, and also makes nice plots.
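As a rough illustration of the coverage-plus-GC idea, the sketch below computes a GC fraction and a k-mer based coverage estimate per contig. It is not how Blobology does it; the function names are hypothetical, the reads are assumed to be plain strings already in memory, and for simplicity only the forward strand of each read is counted.

```python
from collections import Counter
from statistics import median


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def count_read_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_coverage(contig, read_kmers, k):
    """Median read k-mer count over all k-mers of the contig."""
    contig = contig.upper()
    depths = [read_kmers.get(contig[i:i + k], 0)
              for i in range(len(contig) - k + 1)]
    return median(depths) if depths else 0
```

Plotting `gc_content` against `kmer_coverage` for every contig gives the kind of scatter plot where organisms with different abundance separate into clusters, even when their GC content is similar.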
