CD-HIT minimum sequence length
8.7 years ago
Swatchpuppy ▴ 50

I ran cd-hit on a set of 1600 sequences, but the program reports that only 1523 were read. I checked which sequences were left out, and all of them are shorter than 15 aa. I have searched the documentation and can't find a stated minimum sequence length. Is there any way to overcome this limitation?

clustering cd-hit sequence • 3.1k views
8.7 years ago
SES 8.6k

The default minimum is 10 for cd-hit, and it is set by the -l option:

-l    length of throw_away_sequences, default 10

There are also several other length thresholds you can set, so it is worth checking your results against the defaults. For example, the length difference cutoffs and alignment length thresholds might also be influencing the results.
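To see which sequences fall under a given cutoff before clustering, a small script along these lines might help. It is only a sketch: the function names `read_fasta` and `short_sequences` and the example records are made up for illustration, and the inclusive comparison (length <= cutoff) is an assumption about how the throw-away length is applied, so check it against what cd-hit actually reports for your data.

```python
from typing import Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA-formatted lines into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def short_sequences(records: List[Tuple[str, str]], cutoff: int = 10) -> List[Tuple[str, str]]:
    """Return records with length <= cutoff.

    Assumption: the comparison is inclusive; this is not confirmed
    against cd-hit itself.
    """
    return [(h, s) for h, s in records if len(s) <= cutoff]


if __name__ == "__main__":
    # Hypothetical records of length 22, 8 and 12.
    example = """>seq1
MKTAYIAKQRQISFVKSHFSRQ
>seq2
MKTAYIAK
>seq3
MKTAYIAKQRQI
"""
    records = read_fasta(example.splitlines())
    for cutoff in (10, 14):
        names = [h for h, _ in short_sequences(records, cutoff)]
        print(cutoff, names)
```

For a real file, pass `open("your_file.fasta")` to `read_fasta` and compare the count with the number of sequences cd-hit says it skipped. If the short sequences need to be kept, lowering `-l` on the cd-hit command line is the option to try; whether very small values are accepted with a given word size is not something confirmed here.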
