Redundancy reduction of de novo transcriptome assemblies with Compacta?
16 months ago
Dunois ★ 2.0k

I've been sweating over how best to reduce the redundancy in a de novo transcriptome assembled using Trinity. I found this tool called Compacta (here's the GitHub repository) that seems to be aimed at solving precisely this problem.

Can anyone comment/share their thoughts on this tool? Has anybody used it before? I haven't really found any papers citing it (yet).

de-novo transcriptome redundancy-reduction • 899 views
0

Have you tried CD-HIT (LINK) for this application?

0

I'm looking at Compacta because I'd rather not use CD-HIT or MMseqs2 if I can avoid it.

1

You can try it out and let us know :-)

0

Well since you mentioned it: I did try it out.

Compacta is easy to install and run, and the paper is very well written, though some very important details are scattered across the paper, its supplement, and a bunch of READMEs that accompany the paper's scripts. The GitHub repo no longer seems to be maintained, but it is still accessible.

The only problem is that it seems to run into trouble with samtools being unable to parse some BAM header elements correctly. I haven't gotten past this yet.

But their idea is solid, and I'd give it a go.

0

That does not sound like Compacta's problem, though? Unfortunately, if the software is not maintained or is tied to some specific version of samtools, then I would not use it, no matter how good it may be.

0

Indeed, I don't think it's Compacta itself. I haven't had the chance to debug it properly; some of the samples I ran through it ran fine, while others didn't. And yeah, it's a shame that it's not being actively maintained, despite there being some interest in the tool. Let's hope somebody forks it.
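
For anyone wanting to narrow down the header issue, one possible first step is to compare the BAM headers of a sample that runs fine against one that doesn't. Below is a minimal sketch in Python that flags header lines which don't follow the line syntax given in the SAM specification. It is only a syntax check: it does not reproduce samtools' own parser, and whether malformed header lines are actually the cause here is an assumption. The function name `find_suspect_header_lines` and the example header are made up for illustration.

```python
import re

# Line syntax for non-comment header records, as given in the SAM specification:
# a record type, then one or more tab-separated TAG:VALUE fields.
HEADER_LINE = re.compile(r"@(HD|SQ|RG|PG)(\t[A-Za-z][A-Za-z0-9]:[ -~]+)+")
# @CO lines carry free-form text after the tab.
COMMENT_LINE = re.compile(r"@CO\t.*")


def find_suspect_header_lines(header_text):
    """Return (line_number, line) pairs for header lines that fail the syntax check."""
    suspects = []
    for number, line in enumerate(header_text.splitlines(), start=1):
        if not line:
            continue  # ignore blank lines, e.g. a trailing newline
        if HEADER_LINE.fullmatch(line) or COMMENT_LINE.fullmatch(line):
            continue
        suspects.append((number, line))
    return suspects


# Hypothetical header text; the third line has a field without a TAG:VALUE pair.
example = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:TRINITY_DN1_c0_g1_i1\tLN:1200",
    "@RG\tID:s1\tSM",
    "@CO\tfree-form comment",
])

for number, line in find_suspect_header_lines(example):
    print(f"line {number}: {line!r}")
```

The header text for a real file can be obtained with `samtools view -H sample.bam` and passed to the function as a string. If the headers of the failing samples pass this check, the problem more likely lies elsewhere, for example in the samtools version being used.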