Why can't the CAP3 tool efficiently find overlapping contigs?
8.7 years ago
seta ★ 1.9k

Hi all,

I did transcriptome assembly of Illumina paired-end 100 bp reads at different k-mer sizes, and I am now trying to merge the resulting assemblies. I pooled them and ran cd-hit-est to remove redundant sequences, then ran CAP3 on the non-redundant sequences with default settings. However, in the CAP3 output many of the sequences end up as singlets rather than in contigs. It seems that CAP3 could not efficiently find the overlapping sequences. Could you please let me know what is wrong here, or whether there is any setting that would improve the result? Thanks for sharing your experience.

Assembly sequencing RNA-Seq genome • 2.6k views
8.7 years ago

Hi seta,

I suggest that you give Dedupe a try. It will find and remove all duplicate contigs and fully-contained contigs, like this:

dedupe.sh in=assm1.fa,assm2.fa out=combined.fa

It can also find and report all overlaps, in dot format. It won't remove or merge overlapping contigs, though.
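
To see how much Dedupe shrinks the pool, you could count FASTA records before and after the run. This is only a minimal sketch: it assumes the file names from the command above, and `count_seqs` is a hypothetical helper written for this example, not part of BBTools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count FASTA records (header lines starting with '>').
count_seqs() {
    # cat feeds grep via stdin, so a single total is printed even for
    # several files; '|| true' because grep exits non-zero on zero matches.
    cat "$@" | grep -c '^>' || true
}

# Run only if dedupe.sh is on PATH and the (assumed) input files exist.
if command -v dedupe.sh >/dev/null 2>&1 && [ -f assm1.fa ] && [ -f assm2.fa ]; then
    before=$(count_seqs assm1.fa assm2.fa)
    dedupe.sh in=assm1.fa,assm2.fa out=combined.fa
    after=$(count_seqs combined.fa)
    echo "sequences before: $before, after dedupe: $after"
fi
```

The difference between the two counts is the number of exact duplicates and fully contained contigs that were dropped; overlapping contigs are still present in `combined.fa`.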


Hi Brian, thanks, but how can I use it for my purpose if it doesn't merge overlapping contigs?


Ah, well, it was not clear what your purpose was. If you specifically want overlapping contigs to get merged, Dedupe is not the correct tool unless you post-process the output. You might try Minimus2.

That said, when we run Minimus2, we always run Dedupe first because it greatly reduces the input volume, which makes Minimus2 take less time and be less likely to crash.


Thanks, I'll try it.

8.7 years ago
h.mon 35k

Some contigs will be assembled at only one particular k-mer size; low k-mer values in particular assemble lots of small contigs. These sequences may be missing entirely from the assemblies made with other k-mer sizes, so there is nothing for them to overlap with, and CAP3 reports them as singlets.
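
One rough way to check whether this explanation fits your data is to count how many singlets there are and how many of them are short. The sketch below assumes CAP3's usual `.cap.contigs` and `.cap.singlets` output files; the helper name `cap3_summary`, the input name `nr.fa`, and the 300 bp cutoff are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize a CAP3 run from its two FASTA outputs.
# Usage: cap3_summary <contigs.fa> <singlets.fa> [min_len]
cap3_summary() {
    local contigs=$1 singlets=$2 min_len=${3:-300}   # 300 bp is an arbitrary cutoff
    local n_contigs
    n_contigs=$(grep -c '^>' "$contigs" || true)
    # Add up the sequence length of each record, then count all singlets
    # and the ones shorter than min_len.
    awk -v n="$n_contigs" -v m="$min_len" '
        /^>/ { if (seen) { total++; if (len < m) short++ } seen = 1; len = 0; next }
             { len += length($0) }
        END  { if (seen) { total++; if (len < m) short++ }
               printf "contigs=%d singlets=%d short_singlets=%d\n", n, total, short }
    ' "$singlets"
}

# Example call, assuming CAP3 was run on a file named nr.fa (an assumed name).
if [ -f nr.fa.cap.contigs ] && [ -f nr.fa.cap.singlets ]; then
    cap3_summary nr.fa.cap.contigs nr.fa.cap.singlets 300
fi
```

If most singlets turn out to be short, that would be consistent with the explanation above. In that case relaxing CAP3's overlap length (`-o`) or identity (`-p`) cutoffs is unlikely to help much, since a sequence with no partner in the other assemblies has nothing to overlap with.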
