Question: CD-HIT minimum sequence length
4.7 years ago, Swatchpuppy wrote:

I ran cd-hit on my machine with a set of 1600 sequences, but the program reports that only 1523 were read. I checked which sequences were left out, and all of them are shorter than 15 aa. I have looked all over the documentation and I can't find a minimum allowed sequence length. Is there any way to overcome this limitation?

cd-hit clustering sequence • 1.7k views
4.7 years ago, SES (Vancouver, BC) wrote:

The default minimum is 10 for cd-hit; it is set by the -l option:

-l    length of throw_away_sequences, default 10

There are also several other length thresholds you can set, so you might want to check your results against the defaults. For example, there are length-difference cutoffs and alignment-length thresholds, and those might also be influencing the results.
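
To confirm which sequences a given -l value would drop, something like the following Python sketch could be used. It is not part of cd-hit: `parse_fasta` and `short_records` are hypothetical helper names, and treating the cutoff as inclusive (length <= cutoff) is an assumption, so compare the count against what cd-hit itself reports.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def short_records(lines, cutoff=10):
    """Return (header, length) for records whose length is <= cutoff.

    The inclusive comparison is an assumption about how the cutoff is applied.
    """
    return [(h, len(s)) for h, s in parse_fasta(lines) if len(s) <= cutoff]


# Small in-memory example; for a real file, pass an open file handle instead.
example = [">pep1", "MKT", ">prot1", "MKTAYIAKQRQISFVK"]
for header, length in short_records(example, cutoff=10):
    print(header, length)
```

If the short sequences should be kept, the cutoff can be lowered with -l on the cd-hit command line.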

