Paired-end read assembly produces too many contigs
9.6 years ago
alebuenosm ▴ 20

Hi. We are assembling a genome of Chromobacterium spp. We have two FASTQ files, one forward and one reverse, containing 300 bp Illumina reads. We need to assemble these reads into contigs and then generate scaffolds. Here is the problem:

We used CLC bio to generate contigs, which produced 78 contigs. Next, we needed to put these contigs in the right order (the order they have in the original genome). We tried to do this with a reference genome, but it failed because CLC does not allow contigs longer than 99,000 bp for this task, and some of ours were longer. So we turned to Mauve, starting from the contigs generated by CLC. Mauve did the job, but left undesirable gaps.

We read some articles and found SPAdes. SPAdes assembled the forward and reverse FASTQ files into contigs and scaffolds, generating 282 contigs and scaffolds (against 78 from CLC), because SPAdes also produced small contigs, some shorter than 200 bp. So my question is:

How can we generate contigs longer than 1000 bp so that we end up with a smaller number of contigs? We did not see a parameter in SPAdes to do this, and CLC is out of the question because we used a trial version. What options remain?

Note: the hard drive was formatted, so the contigs no longer exist and we are starting from the FASTQ files again.

Thank you in advance for your time.

Assembly • 3.2k views
9.6 years ago
rtliu ★ 2.2k

AlignGraph is an algorithm for extending and joining de novo-assembled contigs or scaffolds, guided by closely related reference genomes: https://github.com/baoe/AlignGraph. I would give it a try.
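
On the short contigs mentioned in the question: one option that does not rely on any assembler parameter is to drop contigs below a length cutoff after assembly. Below is a minimal Python sketch, not a tested pipeline. The function names (`read_fasta`, `filter_contigs`, `write_fasta`) are made up for illustration, and the 1000 bp cutoff is just the figure from the question. Note that this only reduces the contig count by discarding short sequences; it does not join contigs into longer ones.

```python
from typing import Iterable, Iterator, List, TextIO, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(lines: Iterable[str], min_length: int = 1000) -> List[Tuple[str, str]]:
    """Keep only contigs whose sequence has at least min_length bases."""
    return [(h, s) for h, s in read_fasta(lines) if len(s) >= min_length]


def write_fasta(records: Iterable[Tuple[str, str]], handle: TextIO, width: int = 60) -> None:
    """Write records as FASTA, wrapping sequence lines at `width` characters."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")
```

To apply it to a real assembly, one could pass an open file handle for the assembler's contig FASTA as `lines` and write the result with `write_fasta`; the exact file name depends on your assembler's output.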

Thank you rtliu, I'll try it.

This is better suited as a comment on rtliu's answer than as an answer by itself.

EDIT: I've moved it there now.
