Question

Redundancy reduction of de novo transcriptome assemblies with Compacta?

3.0 years ago

Dunois ★ 2.5k

I've been sweating over how best to reduce the redundancy in a de novo transcriptome assembled using Trinity. I found this tool called Compacta (here's the GitHub repository) that seems to be aimed at solving precisely this problem.

Can anyone comment/share their thoughts on this tool? Has anybody used it before? I haven't really found any papers citing it (yet).

de-novo transcriptome redundancy-reduction • 1.4k views

Have you tried CD-HIT (LINK) for this application?

3.0 years ago by GenoMax 141k

I'm looking at Compacta because I'd rather not use CD-HIT or MMseqs2 if I can avoid it.

3.0 years ago by Dunois ★ 2.5k

You can try it out and let us know :-)

Good to know about it though. Thanks for posting the link.

3.0 years ago by GenoMax 141k

Well, since you mentioned it: I did try it out.

Compacta is easy to install and run, and the paper is very well written, although some very important details are scattered across the paper, its supplement, and several READMEs accompanying the paper's scripts. The GitHub repo no longer seems to be maintained, but it is still accessible.

The only problem I've hit is that samtools seems unable to parse some BAM header elements correctly. I haven't gotten past this yet.

But their idea is solid, and I'd give it a go.

3.0 years ago by Dunois ★ 2.5k

That does not sound like Compacta's problem, though. Unfortunately, if the software is not maintained or is tied to a specific version of samtools, I would not use it, no matter how good it may be.

3.0 years ago by GenoMax 141k

Indeed, I don't think it's Compacta itself. I haven't had the chance to debug it properly; some of the samples I ran through it did run fine, and yet some others didn't. And yeah, it's a shame that it's not being actively maintained, despite there being some interest in the tool. Let's hope somebody forks it.
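For anyone who wants to narrow down the header issue, here is a rough sketch of how the failing BAMs could be checked. It is not taken from Compacta or its docs, and I haven't confirmed that this is the actual cause. It assumes you have dumped each header to text with `samtools view -H`, and it only checks the basic rules from the SAM specification (tab-separated `TAG:VALUE` fields and the required tags per record type). The function name `check_header` and the example header are made up for illustration.

```python
import re

# Required tags for each header record type, per the SAM specification.
REQUIRED_TAGS = {"HD": {"VN"}, "SQ": {"SN", "LN"}, "RG": {"ID"}, "PG": {"ID"}}

# A header field is a two-character tag, a colon, then printable ASCII.
FIELD_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]:[ -~]+$")


def check_header(header_text):
    """Return a list of (line_number, message) for suspicious header lines."""
    problems = []
    for lineno, line in enumerate(header_text.splitlines(), start=1):
        if not line:
            continue
        if line.startswith("@CO\t"):
            continue  # free-text comment lines have no TAG:VALUE structure
        parts = line.split("\t")
        record = parts[0]
        if not record.startswith("@") or record[1:] not in REQUIRED_TAGS:
            # Also catches lines where spaces were used instead of tabs.
            problems.append((lineno, f"unknown record type {record!r}"))
            continue
        tags = set()
        for field in parts[1:]:
            if FIELD_RE.match(field):
                tags.add(field[:2])
            else:
                problems.append((lineno, f"malformed field {field!r}"))
        missing = REQUIRED_TAGS[record[1:]] - tags
        if missing:
            problems.append((lineno, f"missing required tag(s) {sorted(missing)}"))
    return problems


if __name__ == "__main__":
    # Made-up header text; in practice, paste in the output of
    # `samtools view -H` for one of the BAMs that fails.
    example = (
        "@HD\tVN:1.6\tSO:coordinate\n"
        "@SQ\tSN:contig1\tLN:512\n"
        "@SQ\tSN:contig2\n"
        "@PG\tID:aligner\tbad field\n"
    )
    for lineno, message in check_header(example):
        print(f"line {lineno}: {message}")
```

Running the same check on a BAM that works and one that doesn't should at least show whether the headers differ in some structural way.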

3.0 years ago by Dunois ★ 2.5k