Question: Binning Tools for Long Reads/Contigs
4 months ago by
vijinim90
vijinim90 wrote:

The majority of currently available metagenomic binning tools are designed to work with short reads and with contigs assembled from short reads.

Does someone know if there are any tools available to bin long reads or contigs obtained from long reads?

Thank you very much! :)

ADD COMMENTlink modified 4 months ago • written 4 months ago by vijinim90

I think Kraken (and possibly Centrifuge) can take long reads. I'm fairly sure Kraken can work on contigs too.

ADD REPLYlink written 4 months ago by jrj.healey13k

Thank you very much. I will try it and see. :)

ADD REPLYlink written 4 months ago by vijinim90

What is the difference between binning long contiguous sequences assembled from short reads and binning long contiguous sequences obtained from long reads?

ADD REPLYlink written 4 months ago by 5heikki8.4k

I believe there is no difference apart from the effects of the error rates of short reads and long reads.

However, I tried to bin a simulated dataset of reads from 2 bacterial genomes (read lengths of 20-21 kb and a 10% error rate), and the tool failed to identify two bins. It produced only one bin containing a few sequences, and most of the remaining sequences were left unbinned. The tool I used was MaxBin 2.2.4.

ADD REPLYlink modified 4 months ago • written 4 months ago by vijinim90

And how different were the two genomes? No tool will successfully separate, e.g., Escherichia coli O157:H7 Sakai and Escherichia coli O157:H7 EC4115.

ADD REPLYlink written 4 months ago by 5heikki8.4k

I used Escherichia coli CFT073 and Staphylococcus aureus JP080. When I bin contigs assembled from short reads of these genomes, MaxBin produces 2 bins with good results.

Similarly, I tried MaxBin with long reads from the same 2 genomes but it gave only 1 bin.

ADD REPLYlink written 4 months ago by vijinim90

Does MaxBin also use depth of coverage? That could be the reason, as you don't get that dimension with long reads.

ADD REPLYlink written 4 months ago by 5heikki8.4k

In this approach, tetranucleotide frequencies and scaffold coverages are combined to organize metagenomic sequences into individual bins, which are predicted from initial identification of marker genes in assembled sequences.

..

Despite careful selection of initialization conditions, the EM algorithm sometimes may still group scaffolds from several composite genomes into one bin. To alleviate this problem, all bins are recursively checked for the median number of marker genes. If the median number of marker genes of any bin is at least 2, the bin will be treated as a dataset waiting to be binned, and the whole EM algorithm will be applied to split the bin.

If MaxBin detects those marker genes at the protein level, I think your 10% simulated error rate will break most of the gene predictions, so too few marker genes are found and everything ends up in a single bin.
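
To make the re-split rule quoted above concrete, here is a minimal sketch of the median check. It is not MaxBin's code: the function name, the marker names, and the counts are all made up for illustration, and only the "median of at least 2" threshold comes from the quoted text.

```python
from statistics import median


def needs_resplit(marker_counts, threshold=2):
    """Return True if the median marker gene count in a bin is >= threshold.

    marker_counts maps a marker gene name to the number of copies found
    in the bin. An empty bin (no markers searched) is never re-split.
    """
    if not marker_counts:
        return False
    return median(marker_counts.values()) >= threshold


# Hypothetical bin holding two genomes, with markers detected cleanly:
# most single-copy markers show up twice.
mixed_bin = {"marker_A": 2, "marker_B": 2, "marker_C": 2, "marker_D": 1}

# Hypothetical bin holding the same two genomes, but with most marker
# genes missed (for example because errors disrupt gene prediction).
noisy_bin = {"marker_A": 0, "marker_B": 1, "marker_C": 0, "marker_D": 0}

print(needs_resplit(mixed_bin))  # median is 2, so the bin would be split
print(needs_resplit(noisy_bin))  # median is 0, so the bin stays merged
```

Under these assumed counts, the second bin never triggers a split even though it contains two genomes, which would be consistent with getting a single bin from error-prone reads.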

ADD REPLYlink modified 4 months ago • written 4 months ago by 5heikki8.4k

Yes, I think this is the issue. I will look for another tool to do the binning. Thank you very much for your insights and explanations. :)

ADD REPLYlink written 4 months ago by vijinim90
4 months ago by
vijinim90
vijinim90 wrote:

I found a tool named MEGAN-LR which can bin metagenomic long reads and contigs. Although it does taxonomic binning rather than de novo binning, I'm going to try it and see.

Thank you all for your insights and ideas. :)

ADD COMMENTlink modified 4 months ago • written 4 months ago by vijinim90

There seems to be no proper tool for de novo (taxonomy-independent) binning of long reads. All the methods I have found so far are based on taxonomic binning.

ADD REPLYlink modified 9 weeks ago • written 3 months ago by vijinim90

Can you not use the information you get from the LAST alignments for functional binning (assuming that is what you need in addition to taxonomic binning)?

ADD REPLYlink written 7 weeks ago by drishti0

I'm not sure. But what I'm looking for is an alignment-free binning tool, and it seems there is no such tool for long reads at the moment.
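
For reference, the composition signal that alignment-free binners rely on (the tetranucleotide frequencies mentioned earlier in the thread) is easy to compute directly. The following is only a sketch of that feature, not a binning tool: the function names are my own, and it skips refinements that real tools often apply, such as merging reverse-complement k-mers.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_profile(seq):
    """Return the normalized 4-mer frequency vector of a DNA sequence.

    Windows containing anything other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Toy sequences only; real reads would be thousands of bases long.
p1 = tetranucleotide_profile("ACGTACGTACGTACGT")
p2 = tetranucleotide_profile("ACGTACGTACGTACGA")
p3 = tetranucleotide_profile("GGGGCCCCGGGGCCCC")
print(euclidean(p1, p2) < euclidean(p1, p3))
```

The resulting vectors could be passed to any general-purpose clustering method. Whether that separates long reads with a 10% error rate is something I have not tested.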

ADD REPLYlink written 5 weeks ago by vijinim90