Question: CD-HIT minimum sequence length
4.7 years ago by
Swatchpuppy50
Portugal
Swatchpuppy50 wrote:

I ran cd-hit on my machine with a set of 1600 sequences, but the program reports that only 1523 were read. I checked which sequences were left out, and all of them are shorter than 15 aa. I have looked all over the documentation and I can't find a minimum allowed sequence length. Is there any way to overcome this limitation?

cd-hit clustering sequence • 1.7k views
4.7 years ago by
SES8.3k
Vancouver, BC
SES8.3k wrote:

The default minimum length for cd-hit is 10, and it is set with the -l option:

-l    length of throw_away_sequences, default 10

That said, cd-hit has several other length-related thresholds, such as length difference cutoffs and alignment length thresholds, and those might also be influencing the results. It is worth checking which of them differ from the defaults in your run.
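To see which of your sequences fall under a given length cutoff before rerunning with a different -l value, something like the following sketch may help. It is not part of cd-hit; the function names `read_fasta` and `short_sequences` and the example records are made up for illustration. It treats sequences with length less than or equal to the cutoff as "short", which is an assumption: the thread does not say whether cd-hit's own comparison is strict or inclusive, so check the boundary case against your actual output.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def short_sequences(text, cutoff=10):
    """Return (header, length) for every sequence with length <= cutoff.

    The inclusive comparison is an assumption made for this sketch.
    """
    return [(h, len(s)) for h, s in read_fasta(text) if len(s) <= cutoff]


# Hypothetical input, only for demonstration.
example = """>seq1
MKTAYIAKQR
>seq2
MKTAYIAKQRQISFVKSHFSRQ
>seq3
MKT
"""

for header, length in short_sequences(example, cutoff=10):
    print(header, length)
```

For a real file, read its contents with `open(path).read()` and pass the text to `short_sequences`. Comparing that list with the sequences cd-hit left out should show whether the -l threshold alone explains the missing records or whether another length setting is involved.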

 

