Question

What do you do with the data processed after running metaspades/spades?

0

3.0 years ago

DNAngel ▴ 250

I have metagenomic data, so I am using kraken2, which classified my reads against various databases to see what species I have. This is great and what I wanted.

But I need to compare the results with metaSPADES, and all I can find for metaSPADES is how to get the contigs.fasta file, which I understand is my main output file. What is the next step? Do I simply BLAST these contigs (which would take very long because there are millions of contigs)? I think I should remap my reads against the contigs to get coverage, but how do I get taxonomy information? Does metaSPADES have a built-in database like kraken2 does that I am somehow missing here? (I don't see any flags or options for this when using the help function.)

Any tips on this would be great!

metaspades spades • 2.5k views

ADD COMMENT • link updated 3.0 years ago by andres.firrincieli 3.6k • written 3.0 years ago by DNAngel ▴ 250

score 0 · Answer 1 · 2021-04-22

0

3.0 years ago

andres.firrincieli 3.6k

metaSPADES does not have a built-in database for taxonomic analysis. It is just an assembler developed for metagenomic sequencing data. Tools like MEGAN, MMseqs2, or even kraken2 can be used for the taxonomic classification of your contigs.
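If you go the kraken2 route on contigs.fasta, the per-sequence output can be summarized per taxon rather than read contig by contig. Below is a minimal Python sketch of that idea, not an official kraken2 utility. It assumes the standard tab-separated per-sequence output (classification status, sequence ID, taxonomy ID, sequence length, k-mer mapping) produced without `--use-names`, so the third column is a plain taxonomy ID. The function name and the example contig lines are made up for illustration.

```python
from collections import defaultdict


def tally_kraken2_contigs(lines):
    """Sum the length (bp) of classified contigs per taxonomy ID.

    Each line is one row of kraken2 per-sequence output:
    status, sequence ID, taxonomy ID, length, k-mer mapping (tab-separated).
    """
    bp_per_taxid = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed rows
        status, _contig_id, taxid, length = fields[:4]
        if status != "C":
            continue  # "U" rows are unclassified contigs
        bp_per_taxid[taxid] += int(length)
    return dict(bp_per_taxid)


# Hypothetical rows, only to show the expected shape of the input.
example = [
    "C\tNODE_1_length_5000_cov_12.3\t562\t5000\t562:100",
    "C\tNODE_2_length_3000_cov_8.1\t562\t3000\t562:60",
    "U\tNODE_3_length_1200_cov_2.0\t0\t1200\t0:20",
    "C\tNODE_4_length_2000_cov_5.5\t1280\t2000\t1280:40",
]

for taxid, bp in tally_kraken2_contigs(example).items():
    print(taxid, bp)
```

Summing contig length instead of counting contigs keeps a handful of long contigs from being outweighed by many short fragments. For real data you would pass an open file handle instead of the example list.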

ADD COMMENT • link 3.0 years ago by andres.firrincieli 3.6k

0

Oh okay, so it just assembles the data. Does MEGAN do binning then of the contigs? I guess my goal would be to group all similar contigs per sample and see what species show up in each sample type. I can't imagine doing it for each individual contig (hundreds of thousands of contigs) for 100s of samples; it seems like more processing would be needed?

ADD REPLY • link 3.0 years ago by DNAngel ▴ 250

0

Also, metaSPADES just produces contigs, so how do I get it to produce MAGs that I can analyze? Is this something MEGAN can do? I'll have to read up on it more.

ADD REPLY • link 3.0 years ago by DNAngel ▴ 250

0

"Does MEGAN do binning then of the contigs?"

MEGAN does a taxonomic (or functional) binning of your contigs using a lowest common ancestor (LCA) algorithm.
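To make the LCA idea concrete, here is a naive Python sketch. It is not MEGAN's implementation, which also filters the hits it considers before placing a sequence. It assumes each database hit for a contig is given as a lineage ordered from root to leaf; the function name and the example lineages are illustrative only.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages (root-to-leaf lists).

    Returns None if there are no lineages or they share no taxon at all.
    """
    if not lineages:
        return None
    lca = None
    # zip(*lineages) walks the lineages rank by rank, stopping at the shortest.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            lca = taxa_at_rank[0]
        else:
            break  # lineages diverge here; keep the last shared taxon
    return lca


# One contig with hits to two different genera in the same family.
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales",
     "Enterobacteriaceae", "Escherichia", "Escherichia coli"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales",
     "Enterobacteriaceae", "Salmonella", "Salmonella enterica"],
]

print(lowest_common_ancestor(hits))
```

A contig whose hits disagree at genus level is placed at the family, which is why LCA-based assignments are conservative: ambiguous sequences move up the tree instead of being forced onto one species.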

"Also, metaSPADES just produces contigs so how do I get it to produce MAGS that I can analyze? Is this something MEGAN can do? I'll have to read up on it more."

MEGAN is not a tool for the reconstruction of putative genomes (MAGs) from contigs. For MAGs, I use CONCOCT, MetaBAT2, and MaxBin2, with a final step in metaWRAP for bin refinement. Usually, binning tools for MAG reconstruction do not use taxonomic information when binning contigs into MAGs.

Taxonomic binning and MAGs are two different things.

ADD REPLY • link 3.0 years ago by andres.firrincieli 3.6k

0

I see... okay, so please tell me if my thought process here makes sense. If I just want to find out the species in my samples and their relative abundance, would reconstructing MAGs make more sense? My understanding is that with the MAGs, I would then have created my "reference" genomes that I can map my trimmed reads against. This makes sense to me, as it is somewhat like using a de novo approach to create a "reference" for me to use... but I am unsure how that would give me abundance. Unless each mapped read would be a "count" and that is my abundance. That is my thinking; I hope I am on the right track.

ADD REPLY • link 3.0 years ago by DNAngel ▴ 250

0

mOTUs does something like that, but using single-copy marker genes predicted from your contigs or MAGs.

Keep in mind that not all your contigs will be binned into MAGs. First, binning algorithms for MAGs typically require contigs longer than about 2.5 kbp (the MetaBAT2 default minimum contig length, for example). Second, certain taxa can be much easier to bin into MAGs than others. Therefore, if you focus only on medium/high-quality MAGs (completeness > 50% and contamination < 10%), you could underestimate the complexity of your community.
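A quick way to see how much of the community you would be ignoring is to apply those quality thresholds and check what fraction of the assembly ends up in the bins you keep. The Python sketch below assumes you already have completeness and contamination estimates per bin (for example from a quality-estimation tool such as CheckM) and a contig-to-length table. The dictionary layout, field names, and numbers are hypothetical.

```python
def filter_mags(bins, min_completeness=50.0, max_contamination=10.0):
    """Keep bins with completeness > min and contamination < max (percent)."""
    return [
        b for b in bins
        if b["completeness"] > min_completeness
        and b["contamination"] < max_contamination
    ]


def binned_fraction(contig_lengths, kept_bins):
    """Fraction of total assembly length (bp) that sits in the kept bins."""
    total_bp = sum(contig_lengths.values())
    if total_bp == 0:
        return 0.0
    binned_bp = sum(
        contig_lengths[contig] for b in kept_bins for contig in b["contigs"]
    )
    return binned_bp / total_bp


# Hypothetical assembly: five contigs, two bins, two contigs never binned.
contig_lengths = {"c1": 5000, "c2": 3000, "c3": 2500, "c4": 1000, "c5": 500}
bins = [
    {"name": "bin.1", "completeness": 92.0, "contamination": 3.0,
     "contigs": ["c1", "c2"]},
    {"name": "bin.2", "completeness": 35.0, "contamination": 2.0,
     "contigs": ["c3"]},
]

kept = filter_mags(bins)
print([b["name"] for b in kept])
print(round(binned_fraction(contig_lengths, kept), 3))
```

In this toy case a third of the assembled bases fall outside the retained MAGs, and any taxa represented only there would be missing from a MAG-only analysis.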

I am not saying that focusing only on MAGs to calculate taxa abundance and diversity is wrong, but you should understand what the limitations are. There are a lot of high-quality papers on this topic.
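On the "each mapped read is a count" idea from the question: raw counts favor large genomes, so a common first step is to divide by genome (or MAG) length before converting to proportions. The Python sketch below shows that normalization under the assumption that you already have read counts per MAG from your own mapping step. The function name and the numbers are illustrative, not output from any particular tool.

```python
def relative_abundance(read_counts, genome_lengths):
    """Length-normalized relative abundance per MAG.

    read_counts: mapped reads per MAG.
    genome_lengths: total length (bp) of each MAG.
    Returns proportions that sum to 1 across the MAGs provided.
    """
    # Reads per base, so a large genome is not favored just for its size.
    density = {
        name: read_counts[name] / genome_lengths[name] for name in read_counts
    }
    total = sum(density.values())
    if total == 0:
        return {name: 0.0 for name in density}
    return {name: value / total for name, value in density.items()}


# Hypothetical counts: bin.1 recruits 4x the reads but is only 2x as abundant.
read_counts = {"bin.1": 40000, "bin.2": 10000}
genome_lengths = {"bin.1": 4_000_000, "bin.2": 2_000_000}

for name, share in relative_abundance(read_counts, genome_lengths).items():
    print(name, round(share, 3))
```

These proportions are relative to the MAGs you include, so reads that map to unbinned contigs, or to nothing at all, are not represented, which is the limitation described above.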

ADD REPLY • link 3.0 years ago by andres.firrincieli 3.6k