Question: Extracted mapped shotgun metagenomic reads to reference genome. SPAdes or metaSPAdes for de-novo assembly?
12 months ago, O.rka wrote:

I have a reference strain and mapped all of my shotgun metagenomic reads to the reference strain using BBMap.

I extracted the mapped reads and want to create an assembly from this.

I usually use metaSPAdes for this but would SPAdes be better suited for this task? My PI prefers using SPAdes and metaSPAdes but I'm wondering which one would be better for this task in particular since it's only a (semi-)supervised subset of a metagenome.

[Bonus] if there is another assembler that is better suited for this exact task please let me know.

metagenomics assembly de-novo • 535 views
12 months ago, h.mon wrote:

To answer your question directly: I think you should use SPAdes, and then probably filter out contigs whose coverage diverges too much from that of the longest contigs. This assumes the longest contigs belong to the strain of interest, which they should, because the reads used for the assembly have been enriched for this strain. Keep in mind that contigs containing rRNA reads generally have abnormally high coverage. However, if the strain of interest is rare in your metagenomic sample, the resulting assembly will be very fragmented because of low coverage.
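The coverage filter could be sketched roughly as below. This is only an illustration, not a tested pipeline: it assumes the contigs carry standard SPAdes headers (`NODE_<n>_length_<len>_cov_<cov>`, where `cov` is k-mer coverage rather than read depth), and the function names, `n_longest`, and the two-fold cutoff `max_fold` are hypothetical choices you would need to tune for your data.

```python
import re
from statistics import median

# SPAdes-style contig header, e.g. NODE_1_length_245031_cov_18.7342
HEADER_RE = re.compile(r"NODE_(\d+)_length_(\d+)_cov_([\d.]+)")


def parse_headers(fasta_lines):
    """Yield (name, length, coverage) for each SPAdes-style FASTA header line."""
    for line in fasta_lines:
        if line.startswith(">"):
            name = line[1:].strip()
            match = HEADER_RE.search(name)
            if match:
                yield name, int(match.group(2)), float(match.group(3))


def filter_by_coverage(contigs, n_longest=10, max_fold=2.0):
    """Return names of contigs whose coverage is within max_fold of the
    median coverage of the n_longest longest contigs (assumed cutoffs)."""
    contigs = list(contigs)
    if not contigs:
        return []
    # Reference coverage: median over the longest contigs, which are
    # assumed to belong to the strain of interest.
    longest = sorted(contigs, key=lambda c: c[1], reverse=True)[:n_longest]
    ref_cov = median(c[2] for c in longest)
    low, high = ref_cov / max_fold, ref_cov * max_fold
    return [name for name, _length, cov in contigs if low <= cov <= high]
```

In practice you would pass the open `contigs.fasta` file to `parse_headers` and inspect the rejected contigs before discarding them, since very high coverage may simply indicate rRNA or other repeats.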

More philosophically:

Looking at some of your recent threads, it seems you have been struggling with the same issue for some days now. From older to more recent:

How to extract reads that match k-mer profiles from a collection of sequences?

How to interpret sam file generated from BBMap?

This thread

Can you assemble with merged paired end reads and unmatched reads as "single ended" reads?

So it seems you have shotgun metagenomic sequencing data, but are interested in only one particular strain. It would be helpful if you described the problem in more detail, along with your motivation for taking this approach. This would help us evaluate whether your approach is sound, or whether a completely different approach would be better.

The approach you have chosen seems to be mapping to a reference strain (a published genome?) and then assembling the genome using just the mapped reads. I wonder whether just mapping to the reference strain and examining the differences (calling SNPs / indels and structural variants) would be good enough for your purposes? Or would assembling the whole metagenome, and then recovering the contigs belonging to the strain of interest, work better?


Yes, that's exactly what I'm doing: I have downloaded all of the reference genomes for a species, and I'm mapping my reads to them with a wide net (BBMap default = 76% identity), extracting the mapped reads, and assembling these. I'm not necessarily looking for closed genomes, but mostly for de-novo assemblies of the organism in the samples I have. Is this method appropriate? Would it be better to use k-mer profiles instead of mapping? I planned on manually binning the contigs afterwards to exclude any false positives.

Reply written 12 months ago by O.rka (modified 12 months ago)