Mapping contigs back to samples from co-assembled metagenome.
7 weeks ago

In an attempt to increase the quality of our metagenomic assembly and ensure we capture low-abundance species, we are co-assembling ~20 samples. However, after the assembly we need to perform some analyses on a per-sample basis, so I need to map contigs back to a sample where reads from that sample map to the contig. Is there a tool that can do this (I can't find one), or will I have to do it myself? If so, at what coverage of a contig should I assign that contig to a sample?

coassembly • 538 views

thus I need to map contigs back to a genome where reads map to that contig from a sample

minimap2 should do the job


So just any old aligner, and then above a certain coverage threshold assign that contig to that sample? What coverage should I use though?


Sorry, maybe you edited the question.

thus I need to map contigs back to a sample where reads map to that contig from said sample.

Help me understand: you did a co-assembly from 20 samples, which resulted in n contigs. Now you want to know from which sample each contig was assembled. Is that right?

Also,

What coverage should I use though?

You are looking for a threshold that can be used to assign contigs to each sample. Is that right?


Apologies, I did edit it! I should have noted that in the question.

Yes, both of your summaries are correct.


Since you did a co-assembly of 20 samples, the most likely scenario is that the final contigs are the result of reads coming from multiple samples. By mapping each sample's reads separately to the co-assembly with any short-read aligner, you can easily calculate the per-sample coverage of each contig.
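As a rough illustration of what "coverage per sample of each contig" means, here is a minimal Python sketch that computes mean depth and breadth (fraction of bases covered) for each contig from one sample's alignments. The function name `coverage_per_contig` and the input layout (0-based, half-open `(contig, start, end)` intervals) are assumptions made for this example; in practice you would more likely take these numbers from whatever your alignment toolchain reports than recompute them yourself.

```python
def coverage_per_contig(contig_lengths, alignments):
    """Compute (mean_depth, breadth) per contig for ONE sample.

    contig_lengths: dict mapping contig name -> length in bases
    alignments: iterable of (contig, start, end), 0-based, half-open
                (hypothetical layout chosen for this sketch)
    """
    # One difference array per contig: +1 where an alignment starts,
    # -1 just past where it ends; a running sum then gives the depth.
    diffs = {name: [0] * (length + 1) for name, length in contig_lengths.items()}
    for contig, start, end in alignments:
        length = contig_lengths[contig]
        diffs[contig][min(max(start, 0), length)] += 1
        diffs[contig][min(max(end, 0), length)] -= 1

    result = {}
    for contig, diff in diffs.items():
        length = contig_lengths[contig]
        depth = 0          # depth at the current position
        total_depth = 0    # sum of depth over all positions
        covered = 0        # number of positions with depth > 0
        for i in range(length):
            depth += diff[i]
            total_depth += depth
            if depth > 0:
                covered += 1
        result[contig] = (total_depth / length, covered / length)
    return result


# Toy example: two overlapping reads on c1, nothing on c2.
sample_a = coverage_per_contig(
    {"c1": 10, "c2": 5},
    [("c1", 0, 5), ("c1", 3, 10)],
)
```

You would run this once per sample (20 times here), which gives you a contig-by-sample table of depth and breadth to work from.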

What coverage should I use though?

Unfortunately, I am not aware of any method/tool that uses a coverage threshold to assign contigs back to samples. If that was your primary goal, perhaps the co-assembly strategy was not the best choice.
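If you do end up rolling your own, the logic itself is short. Below is a minimal sketch, not an established method: it marks a contig as "present" in a sample when both mean depth and breadth of coverage pass a cutoff. The function name `assign_contigs` and the default cutoffs (`min_depth=1.0`, `min_breadth=0.5`) are placeholders invented for this example, not recommended values; you would have to justify your own. Note that a contig can be assigned to several samples, or to none.

```python
def assign_contigs(coverage, min_depth=1.0, min_breadth=0.5):
    """Call contig presence per sample from per-sample coverage.

    coverage: dict sample -> dict contig -> (mean_depth, breadth)
    min_depth, min_breadth: ILLUSTRATIVE cutoffs only (assumptions)
    Returns dict contig -> sorted list of samples passing both cutoffs.
    """
    present = {}
    for sample, per_contig in coverage.items():
        for contig, (depth, breadth) in per_contig.items():
            # Make sure every contig appears in the output,
            # even if no sample passes the cutoffs.
            present.setdefault(contig, [])
            if depth >= min_depth and breadth >= min_breadth:
                present[contig].append(sample)
    return {contig: sorted(samples) for contig, samples in present.items()}


# Made-up numbers for two samples and three contigs.
toy_coverage = {
    "S1": {"c1": (5.0, 0.9), "c2": (0.2, 0.1), "c3": (4.0, 0.8)},
    "S2": {"c1": (0.5, 0.3), "c2": (3.0, 0.8), "c3": (2.0, 0.7)},
}
assignments = assign_contigs(toy_coverage)
```

Requiring breadth as well as depth is a design choice in this sketch: a few reads piling up on a short conserved region can give a decent mean depth while most of the contig stays uncovered.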


Unfortunately, I am not aware of any method/tool that uses a coverage threshold to assign contigs back to samples. If that was your primary goal, perhaps the co-assembly strategy was not the best choice.

Do you know of another method by which we can do this?

We are trying to recover MAGs from the metagenomes and unfortunately have a lot of host in the raw reads, so this was our attempt to get good MAGs. I will also try assembling by sample, though!


Most binning tools use differential coverage and other stats to cluster contigs into bins and eventually MAGs. Right now you have everything you need to recover MAGs from the co-assembly.
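To make "differential coverage" a bit more concrete: contigs from the same genome should rise and fall in depth together across your samples, so their per-sample depth profiles correlate. The sketch below uses plain Pearson correlation on made-up depth values. It is only an illustration of the idea, not how any particular binner is implemented; real tools combine coverage with sequence composition and use their own statistical models. The function name `pearson` and the contig names are assumptions for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length depth profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A flat profile carries no differential-coverage signal.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


# Mean depth of three contigs across four samples (made-up values).
profiles = {
    "contig_1": [10.0, 2.0, 0.0, 5.0],
    "contig_2": [20.0, 4.0, 0.0, 10.0],  # same shape as contig_1, twice the depth
    "contig_3": [0.0, 8.0, 9.0, 1.0],    # abundant where contig_1 is rare
}

same_genome_like = pearson(profiles["contig_1"], profiles["contig_2"])
different_genome_like = pearson(profiles["contig_1"], profiles["contig_3"])
```

Here `contig_1` and `contig_2` correlate almost perfectly and would tend to end up in the same bin, while `contig_3` is anti-correlated with both. More samples give longer profiles and therefore more power to separate genomes, which is one reason a 20-sample co-assembly is well suited to binning.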

unfortunately have a lot of host in the raw reads so this was our attempt to get good MAGs.

I suppose you did not remove the host reads because you do not have a good reference genome. If that is the case, binning tools 'should' be able to discriminate host contigs from the rest.


I suppose you did not remove the host reads because you do not have a good reference genome. If that is the case, binning tools 'should' be able to discriminate host contigs from the rest.

We do de-host. It is just that a lot of the coverage we paid for ends up being on the host, so if we assemble and bin by sample we are worried we will miss a lot of low-abundance species.

Most binning tools use differential coverage and other stats to cluster contigs into bins and eventually MAGs. Right now you have everything you need to recover MAGs from the co-assembly.

Unfortunately, for the purpose of the study a by-sample analysis is needed, as it is a comparison across samples. Mapping a contig to a sample should not confound this though, I don't think?