6.7 years ago
kvc0004
I recently assembled my shotgun metagenomic reads using metaSPAdes with default settings and then uploaded the contigs to MG-RAST for annotation. However, some of the samples have contigs longer than the maximum length cutoff (500,000 bp). According to the manual, the MG-RAST pipeline has a hard time annotating reads/contigs that contain multiple organisms. If I reduce my k-mer size, would this result in shorter contigs? Or does anyone have other suggestions for adjusting contig length without manually splitting the contigs and compromising coverage statistics?
Thank you!
The best approach is to use a different program that does not have such a strange and arbitrary limitation. But if you need to observe a 500kbp limit, manually separate them, rather than trying to find parameters that make SPAdes give you a bad assembly. You can use BBMap like this:
That will break sequences longer than 500kbp into 500kbp pieces.
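For anyone who wants to see what this kind of splitting does, here is a minimal sketch in plain Python. It is not the BBMap command, only an illustration of the same idea under a few assumptions: the function names (`read_fasta`, `shred`, `write_fasta`), the `_partN` header suffix, and the file names in the usage note are all hypothetical, and contigs of exactly 500,000 bp are assumed to be acceptable because the limit is described in this thread as applying to sequences >500 kbp.

```python
MAX_LEN = 500_000  # length limit reported in this thread (assumed inclusive)


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def shred(records, max_len=MAX_LEN):
    """Split sequences longer than max_len into consecutive pieces.

    Shorter sequences pass through unchanged. Pieces are named
    <first word of header>_part<N> (a hypothetical naming scheme).
    """
    for header, seq in records:
        if len(seq) <= max_len:
            yield header, seq
            continue
        name = header.split(" ", 1)[0]
        for i, start in enumerate(range(0, len(seq), max_len), 1):
            yield f"{name}_part{i}", seq[start:start + max_len]


def write_fasta(records, handle, width=80):
    """Write (header, sequence) pairs as FASTA, wrapped at `width` columns."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")
```

To use it, open an input and an output file (for example `contigs.fasta` and `shredded.fasta`, both placeholder names) and call `write_fasta(shred(read_fasta(fin)), fout)`. Note that the pieces do not overlap, so a gene spanning a cut point will be split between two pieces.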
Thank you so much! I will try that!
I could not find any mention of the maximum contig length in the manual. Could you link to the manual and point to the page or section that mentions it?
It is on page 77 in the Data submission via the web interface section.
I could not find any mention of a limit on contig size there. Are we talking about the same manual?
MG-RAST Manual for version 4, revision 1. October 3rd, 2016
ftp://ftp.metagenomics.anl.gov/data/manual/mg-rast-manual.pdf
My mistake, that page refers to the file format. I found out about the bp cutoff only after uploading the file. MG-RAST will not let you submit files that contain one or more sequences/contigs longer than 500 kbp: such files cannot be selected for submission on the submit page.
Below is the link to the screenshot of the submission page:
https://ibb.co/jiWHc5