Question: How do I reduce the length of a contig?
kvc0004 (United States) wrote, 2.3 years ago:

I recently assembled my shotgun metagenomic reads with metaSPAdes using the default settings, then uploaded the assemblies to MG-RAST for annotation. However, some of the samples have contigs longer than the maximum length cutoff (500,000 bp). According to the manual, the MG-RAST pipeline has a hard time annotating reads/contigs that contain multiple organisms. If I reduce my k-mer size, would this result in shorter contigs? Or does anyone have other suggestions for adjusting contig length without manually splitting the contigs and compromising coverage statistics?

Thank you!

metagenomics assembly • 977 views

The best approach is to use a different program that does not have such a strange and arbitrary limitation. But if you need to observe a 500 kbp limit, split the contigs manually rather than trying to find parameters that make SPAdes give you a bad assembly. You can use reformat.sh from BBMap like this:

reformat.sh in=good_assembly.fa out=bad_assembly.fa fastareadlen=500000

That will break sequences longer than 500 kbp into pieces of at most 500 kbp.
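If BBMap is not available, the same idea can be sketched in a few lines of Python. This is only an illustration of the splitting step, not how reformat.sh is implemented: the function names and the `_partN` header suffix are made up for this example, and reformat.sh may name the pieces differently.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_record(header, seq, max_len=500_000):
    """Yield (header, piece) pairs where every piece is at most max_len long.

    Sequences already within the limit are passed through unchanged.
    The "_partN" suffix is a hypothetical naming scheme for this sketch.
    """
    if len(seq) <= max_len:
        yield header, seq
        return
    for part, start in enumerate(range(0, len(seq), max_len), 1):
        yield f"{header}_part{part}", seq[start:start + max_len]


def split_fasta(src, dst, max_len=500_000):
    """Copy FASTA records from src to dst, splitting over-length sequences."""
    for header, seq in read_fasta(src):
        for new_header, piece in split_record(header, seq, max_len):
            dst.write(f">{new_header}\n{piece}\n")


# Example use with real files (file names taken from the command above):
# with open("good_assembly.fa") as src, open("bad_assembly.fa", "w") as dst:
#     split_fasta(src, dst)
```

Note that each sequence is written on a single line, and that the last piece of a split contig is usually shorter than the limit.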

modified 2.3 years ago • written 2.3 years ago by Brian Bushnell

Thank you so much! I will try that!

written 2.3 years ago by kvc0004

I could not find a mention of the maximum contig length here. Could you post a link to the manual and the page or section that mentions it?

written 2.3 years ago by h.mon

It is on page 77, in the "Data submission via the web interface" section.

written 2.3 years ago by kvc0004

I could not find any mention of a limit on contig size. Are we talking about the same manual?

MG-RAST Manual for version 4, revision 1. October 3rd, 2016

ftp://ftp.metagenomics.anl.gov/data/manual/mg-rast-manual.pdf

modified 2.3 years ago • written 2.3 years ago by h.mon

My mistake, that page refers to the file format. I found out about the bp cutoff after uploading the file: MG-RAST will not let you submit files that contain one or more sequences/contigs longer than 500 kbp, and such files cannot be selected on the submit page.

Below is the link to the screenshot of the submission page:

https://ibb.co/jiWHc5
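To see which contigs would trip the cutoff before uploading, something like the following rough Python sketch could be used. The function names are invented for this example, and the 500,000 bp default is simply the cutoff shown on the submission page.

```python
import io


def sequence_lengths(handle):
    """Yield (header, length) for each record in an open FASTA file handle."""
    header, length = None, 0
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, length
            header, length = line[1:], 0
        elif header is not None:
            length += len(line)
    if header is not None:
        yield header, length


def over_limit(handle, max_len=500_000):
    """Return (header, length) for every sequence longer than max_len."""
    return [(h, n) for h, n in sequence_lengths(handle) if n > max_len]


# Example use with a real file (the file name is a placeholder):
# with open("assembly.fa") as fh:
#     for header, length in over_limit(fh):
#         print(header, length)
```

An empty result means that no sequence in the file is over the cutoff.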

modified 2.3 years ago • written 2.3 years ago by kvc0004