Hi all,
I assembled a transcriptome from Illumina paired-end 100 bp reads at several k-mer sizes, and I am now trying to merge the resulting assemblies. I pooled them and ran cd-hit-est to remove redundant sequences, then ran CAP3 on the non-redundant sequences with default settings. However, in the CAP3 output many of the sequences end up as singlets rather than contigs. It seems that CAP3 could not efficiently find the overlapping sequences. Could you please let me know what is wrong here, or whether there is any setting that would improve the result? Thanks for sharing your experience.
Hi Brian, thanks, but how can I use it for my purpose if it doesn't merge overlapping contigs?
Ah, well, it was not clear what your purpose was. If you specifically want overlapping contigs to get merged, Dedupe is not the correct tool unless you post-process the output. You might try Minimus2.
That said, when we run Minimus2, we always run Dedupe first because it greatly reduces the input volume, which makes Minimus2 take less time and be less likely to crash.
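To make the distinction concrete, here is a toy Python sketch of the two steps: first drop sequences that are exact duplicates or fully contained in a longer one (the redundancy-removal step), then join two contigs whose ends overlap (the merging step). This is not how Dedupe or Minimus2 work internally; it uses exact string matching only, ignores reverse complements and mismatches, and the function names, the example sequences, and the min_len threshold are all made up for illustration.

```python
def is_contained(short, long):
    """True if `short` occurs as an exact substring of `long`."""
    return len(short) <= len(long) and short in long


def remove_contained(contigs):
    """Drop exact duplicates and contigs contained in a longer kept contig."""
    kept = []
    # Longest first, so every possible container is seen before its contents.
    for seq in sorted(set(contigs), key=lambda s: (-len(s), s)):
        if not any(is_contained(seq, k) for k in kept):
            kept.append(seq)
    return kept


def dovetail_overlap(a, b, min_len=3):
    """Length of the longest exact suffix of `a` that is also a prefix of `b`.

    Only proper overlaps are considered (shorter than both sequences).
    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)) - 1, min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def merge_pair(a, b, min_len=3):
    """Join `a` and `b` across their end overlap, or return None if none."""
    n = dovetail_overlap(a, b, min_len)
    return a + b[n:] if n else None


# Hypothetical pooled contigs: one duplicate, one contained fragment,
# and two contigs that overlap end to end.
contigs = ["ACGTACGGTTCA", "GTACGG", "GGTTCATTGACC", "ACGTACGGTTCA"]

nonredundant = remove_contained(contigs)   # redundancy removal only
merged = merge_pair(nonredundant[0], nonredundant[1])  # overlap merging

print(nonredundant)
print(merged)
```

After the first step the two end-overlapping contigs are both still present as separate sequences; only the second step turns them into one longer contig. The first step also shrinks the input that the second step has to compare pairwise.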
Thanks, I'll try it.