Hi guys, I have about 300,000 sequences stored in a FASTA file, and I am trying to reduce their redundancy. I used CD-HIT-EST to remove redundancy at a 95% similarity threshold and am planning to reduce it further with other tools. I tried tgicl, but it seems to be a very old and buggy tool, and it didn't work well on my FASTA file. Are there other DNA clustering tools that serve this purpose? Any recommendations?