I am looking into assembling 454 reads from a metagenomic sample into contigs for protein prediction (homology search, de novo gene finding).
In the papers and data sets I have looked at so far, people mostly focus on phylotyping and thus rely on the raw reads. In cases where they do assemble the reads, the assemblies are mediocre (a huge percentage of singletons, only a few contigs >2000 bp); the N50 barely exceeds the average read length.
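For anyone comparing assemblies the same way, this is the N50 statistic I mean: the contig length L such that contigs of length >= L cover at least half of the total assembly. A minimal sketch (the contig lengths are made up for illustration):

```python
# Minimal sketch: compute N50 from a list of contig lengths.

def n50(lengths):
    """Length L such that contigs >= L cover at least half the total assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0

contigs = [5000, 3000, 2000, 1000, 500, 500]  # hypothetical contig lengths
print(n50(contigs))  # → 3000
```

So an N50 near the average read length means most "contigs" are barely more than single reads.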
Now my question is: why are the assemblies so bad? I assume that the coverage provided by a single 454 run (~1M reads) is too low, and that together with 454's error model, Newbler has a hard time finding enough overlaps. I also tried the MIRA assembler on one data set, with more or less the same result; Velvet didn't do any better on these reads either.
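To put the low-coverage intuition into numbers, here is a back-of-the-envelope Lander-Waterman estimate (C = N * L / G). The read length and the effective community size are assumptions I made up for illustration, not measured values from my data set:

```python
# Back-of-the-envelope coverage estimate: C = N * L / G (Lander-Waterman).
# All numbers are illustrative assumptions.

reads = 1_000_000        # roughly one 454 run
read_len = 400           # assumed average read length in bp
community_size = 500e6   # assumed effective metagenome size in bp
                         # (e.g. on the order of 100 bacterial genomes)

coverage = reads * read_len / community_size
print(f"{coverage:.2f}x")  # → 0.80x
```

At well under 1x average coverage of the community, most genome positions are sampled by a single read, so an overlap-based assembler simply has nothing to join, regardless of which assembler is used.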
So, does anybody have a suggestion on how to improve the assembly? Different software? More runs and thus higher coverage?
I am grateful for your suggestions. Thanks!