Question: Assembly Of 454 Reads From Metagenomic
pmenzel wrote, 7.8 years ago:

Hi Biostar,

I am looking into assembly of 454 reads from a metagenomic sample into contigs for protein prediction (homology, de-novo gene finding).
In the papers and data sets that I have looked at so far, people mostly focus on phylotyping and thus mostly rely on the raw reads. In cases where they assemble the reads, the assemblies are mediocre (a huge percentage of singletons, only a few contigs >2000 bp). The N50 barely exceeds the average read length.
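
For readers unfamiliar with the N50 statistic mentioned here, a minimal Python sketch of how it is usually computed follows. The function name `n50` and the example lengths are illustrative assumptions, not output from any of the assemblers discussed in this thread.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    Returns 0 for an empty input.
    """
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if running * 2 >= total:
            return length
    return 0


# Hypothetical contig lengths, for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

If an assembly's N50 is close to the mean read length, most bases still sit in singletons or in contigs built from very few reads, which is the situation described above.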

Now my question is, why are the assemblies so bad? I assume that the coverage provided by a single 454 run (giving ~1M reads) is too low and that, together with 454's error model, Newbler has a hard time finding enough overlaps. I also tried the MIRA assembler on one data set, but the result was more or less the same. Velvet did not work any better on these reads either.

So, does anybody have a suggestion on how to improve the assembly? Another software? More runs and thus higher coverage?

I am grateful for your suggestions. Thanks!

assembly metagenomics • 2.6k views

What's your estimated coverage of the target genome? Generally, high coverage yields better assemblies.
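
A rough way to put a number on this is the standard estimate coverage = (number of reads × mean read length) / genome size, scaled by the fraction of reads that come from the genome of interest in a mixed sample. The sketch below is only an illustration: the function names, the 400 bp mean read length, the 4 Mb genome size and the 1% abundance are all assumptions, not values from this thread (only the ~1M reads per run is).

```python
import math


def expected_coverage(num_reads, mean_read_len, genome_size, abundance=1.0):
    """Expected per-base coverage of one genome in a (meta)genomic sample.

    abundance is the assumed fraction of reads originating from this genome.
    """
    return num_reads * mean_read_len * abundance / genome_size


def fraction_uncovered(coverage):
    """Expected fraction of bases with zero reads, assuming reads are
    placed uniformly at random (Poisson / Lander-Waterman approximation)."""
    return math.exp(-coverage)


# Assumed values for illustration: 1M reads, 400 bp mean length, 4 Mb genome.
single_genome = expected_coverage(1_000_000, 400, 4_000_000)
# Same run, but the genome is assumed to make up only 1% of the community.
rare_member = expected_coverage(1_000_000, 400, 4_000_000, abundance=0.01)
```

Under these assumed numbers, one run gives deep coverage of a single isolate but only about 1x for a community member at 1% abundance, where roughly a third of its bases would not be covered by any read.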

written 7.8 years ago by mylons

Once you consider the complexity of the problem (assembling reads from an unknown number of potentially similar genomes, mapped via short and increasingly noisy reads), classic-style assembly is bound not to work correctly.

written 7.8 years ago by Istvan Albert
Casey Bergman (Athens, GA, USA) wrote, 7.8 years ago:

Have you tried metAMOS yet?

Manu Prestat (Marseille, France) wrote, 7.8 years ago:

First, it depends on what your metagenome is (in particular, the biodiversity in it). Native soil, ocean, and gut samples will not give the same coverage, and therefore not the same assembly results. We also worked on this question (for soil); you can find some useful information in that work.


The target would ultimately be enriched cultures (very few species) and samples with less diversity (extremophiles). So I guess that would help me a lot. I was just wondering whether I had missed some technical aspects.

written 7.8 years ago by pmenzel

Great! You are "lucky" to assemble in this context! ;-)

written 7.8 years ago by Manu Prestat
Eddie Belter (United States) wrote, 7.8 years ago:

Why not try alignment to a set of references?
