bin reassembly question
8 weeks ago
shevch2009 ▴ 20

Hi all, I have a question about bin reassembly.

I started with shotgun assemblies (MEGAHIT) and performed binning with metaWRAP (binning module, bin refinement, taxonomy with GTDB-Tk). I ended up with a set of bins, and half of them have replicates. By replicates I mean that pairs of bins correspond to the same taxonomic group (same genus or family) but differ in completeness.

I tried concatenating the contigs from two taxonomically similar bins and reassembling them with metaWRAP (binning module, bin refinement, taxonomy with GTDB-Tk), hoping to improve bin quality.

However, the reassembled bins ended up with much worse quality: contamination increased and completeness dropped significantly, so the bin refinement module could not even find bins with 50% completeness, although some of the original bins were 80% and 90% complete.

Why could this happen? What are the possible explanations? Is this approach wrong? Almost all bins from CONCOCT look like just one contig; how come? Is there any way to increase the completeness of bins?

Thanks, Best, Alla

data metawrap assembly shotgun • 504 views
8 weeks ago
Mensur Dlakic ★ 30k

Why could this happen? What are the possible explanations? Is this approach wrong? Almost all bins from CONCOCT look like just one contig; how come? Is there any way to increase the completeness of bins?

This is a wrong approach. It is completely plausible to have two distinct bins that are taxonomically similar and differ in completeness. That means that two similar organisms are living in the same environment, which should not be surprising.

CheckM can merge the bins, but only when they have complementary sets of marker genes - see the merge option.

https://github.com/Ecogenomics/CheckM
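To make "complementary sets of marker genes" concrete, here is a minimal Python sketch. It is not CheckM's merge logic: the marker names, the `looks_complementary` helper and its `max_shared_fraction` cutoff are all made up for illustration. The idea is simply that two fragments of one genome should share few marker genes, while two similar but distinct organisms will each carry most of the same markers.

```python
# Toy illustration only. Marker names and the cutoff are hypothetical;
# this is not how CheckM itself decides which bins to merge.

def compare_marker_sets(markers_a, markers_b):
    """Split the marker genes of two bins into unique and shared groups."""
    a, b = set(markers_a), set(markers_b)
    return {
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "shared": sorted(a & b),
    }

def looks_complementary(markers_a, markers_b, max_shared_fraction=0.1):
    """Return True if the two bins share few markers (hypothetical cutoff)."""
    a, b = set(markers_a), set(markers_b)
    if not a or not b:
        return False
    shared_fraction = len(a & b) / min(len(a), len(b))
    return shared_fraction <= max_shared_fraction

# Two fragments of one genome: their marker sets barely overlap.
fragment_1 = [f"M{i:02d}" for i in range(1, 6)]    # M01..M05
fragment_2 = [f"M{i:02d}" for i in range(6, 10)]   # M06..M09

# Two related organisms: each bin has most markers on its own.
bin_80 = [f"M{i:02d}" for i in range(1, 9)]        # M01..M08
bin_90 = [f"M{i:02d}" for i in range(1, 10)]       # M01..M09

print(looks_complementary(fragment_1, fragment_2))
print(looks_complementary(bin_80, bin_90))
print(compare_marker_sets(bin_80, bin_90)["only_b"])
```

In this toy setup only the first pair would be a merge candidate. Bins at 80% and 90% completeness from the same genus necessarily overlap in most markers, so they are not complementary.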

You get increased contamination because these organisms are similar enough that marker genes expected to occur only once are present in multiple copies after the re-assembly. I don't know for certain why you get decreased completeness, but most likely it happens because including two similar but non-identical organisms forces fragmentation during the assembly.
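The contamination effect can be seen with a deliberately simplified calculation. CheckM's real estimate uses lineage-specific, collocated marker sets, so the sketch below is only an approximation of the idea: completeness is the share of expected markers seen at least once, and contamination is the number of extra copies relative to the number of expected markers. The ten-marker set and the `estimate_quality` function are hypothetical.

```python
from collections import Counter

def estimate_quality(marker_hits, expected_markers):
    """Simplified completeness/contamination estimate, both in percent.

    marker_hits: one entry per marker gene copy found in the bin.
    expected_markers: single-copy markers expected for the lineage.
    """
    expected = set(expected_markers)
    counts = Counter(hit for hit in marker_hits if hit in expected)
    found = len(counts)                                 # markers seen at least once
    extra_copies = sum(n - 1 for n in counts.values())  # copies beyond the first
    completeness = 100.0 * found / len(expected)
    contamination = 100.0 * extra_copies / len(expected)
    return completeness, contamination

# Hypothetical set of ten single-copy markers.
expected = [f"M{i:02d}" for i in range(1, 11)]

organism_a = [f"M{i:02d}" for i in range(1, 9)]    # 8 of 10 markers
organism_b = [f"M{i:02d}" for i in range(1, 10)]   # 9 of 10 markers

print(estimate_quality(organism_a, expected))
print(estimate_quality(organism_b, expected))
# Pooling both bins duplicates every marker the two organisms share.
print(estimate_quality(organism_a + organism_b, expected))
```

Under these assumptions the pooled bin is no more complete than the better of the two originals, while its contamination jumps because eight markers now appear twice. The sketch does not model the completeness drop, which would come from the assembler fragmenting the mixed input.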

Generally speaking, there is no way to artificially increase the bin completeness. It is what it is, unless you get more sequencing data. There are ways to clean up the bins, so at least the contamination can be addressed.

https://github.com/snayfach/MAGpurify


Thank you so much for the explanations.
