Question: How to separate or bin a mix of eukaryote and prokaryote DNA?
jamesT (United States) wrote, 4.8 years ago:

I have several contigs (~1 kb - 20 kb) from a metagenomic sample which, according to marker gene analysis, belong primarily to one eukaryote or several prokaryotes, both bacterial and archaeal. All of the contigs have a very similar GC content.

What is the best way to separate the eukaryotic DNA from the prokaryotic DNA without using references (e.g., BLASTing each contig against a reference won't work, because most of the contigs are too far removed from any reference)? Codon biases? Looking for certain genomic features?

Tags: sequencing, metagenomics, genome
5heikki (Finland) wrote, 4.8 years ago:

How about an ESOM (emergent self-organizing map) visualization? Here's a nice resource. Also check, e.g., "Alignment-free Visualization of Metagenomic Data by Nonlinear Dimension Reduction".


I've been running MaxBin (paper) on our latest metagenomic assemblies and could not be happier with the results. Unlike with, e.g., the ESOM approach, with MaxBin you do not need to provide dozens of parameters or select bins by hand. In essence, MaxBin estimates the number of bins through marker gene analysis and then bins scaffolds on the basis of coverage and tetranucleotide frequency. Highly recommended. I've been thinking of combining IDBA-UD (initial assembly), MaxBin (binning) and PRICE (targeted assembly of bins) into a really kick-ass pipeline.
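
For anyone unfamiliar with the tetranucleotide frequency signal mentioned above, here is a minimal Python sketch of how such a composition profile can be computed for a contig and compared between contigs. This is only an illustration of the feature, not MaxBin's actual code; the function names and the choice of Euclidean distance are assumptions made for the example.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_freqs(seq):
    """Return the normalized frequencies of the 256 tetranucleotides in seq."""
    seq = seq.upper()
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only windows made of A/C/G/T contribute; windows with N etc. are ignored.
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical toy sequences; real contigs would be read from a FASTA file.
profile_a = tetranucleotide_freqs("ACGTACGTACGTTTGACA")
profile_b = tetranucleotide_freqs("GGGCGCGCCGGGCGCATG")
print(euclidean(profile_a, profile_b))
```

Contigs from the same genome tend to have similar profiles, so small distances suggest a shared origin. On contigs as short as 1 kb the profile is noisy, which is one reason binners combine it with coverage.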

Reply written 4.6 years ago by 5heikki
thackl (MIT) wrote, 4.8 years ago:

Have a look at MGTAXA. It does a fairly good job of assigning taxonomic ranks to contigs. You might, however, need to train a custom model.

Alternatively, you can distinguish sequences by coverage (estimated from either read mapping or k-mer counts) and, additionally, GC content. Blobology uses this kind of approach and also makes nice plots.
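
As a rough illustration of the coverage-plus-GC idea, the Python sketch below pairs each contig's GC fraction with a mean coverage value. It is not Blobology's code: the function names and toy data are hypothetical, and the coverage values are assumed to have been computed beforehand (for example from a read-mapping step that is not shown).

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def gc_coverage_table(contigs, coverage):
    """Pair each contig's GC fraction with its mean coverage.

    contigs:  dict of contig name -> sequence
    coverage: dict of contig name -> mean read depth (assumed precomputed)
    Contigs without a coverage value are skipped.
    """
    return [
        (name, gc_fraction(seq), coverage[name])
        for name, seq in sorted(contigs.items())
        if name in coverage
    ]


# Hypothetical toy data, for illustration only.
contigs = {"contig_1": "ATGCGCGCAT", "contig_2": "ATATATGCAT"}
coverage = {"contig_1": 35.0, "contig_2": 8.5}

for name, gc, cov in gc_coverage_table(contigs, coverage):
    print(f"{name}\t{gc:.2f}\t{cov:.1f}")
```

Plotting GC against coverage for all contigs then shows whether they fall into separate clusters. When GC content is nearly identical across contigs, as in the question, coverage would have to carry most of the separation.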
