How can I run cd-hit-est with a cluster threshold less than 0.8?
6.4 years ago
m.koohi.m ▴ 120

Dear friends,

I am trying to run cd-hit-est with a cluster threshold below 0.8, but every time I get the following error:

Fatal Error: invalid clstr threshold, should >=0.8 Program halted !!

I tried:

cd-hit-est -i seq.fasta -o out.fasta  -d 0 -T 10 -g 1 -M 10000 -c 0.6 -n 4

The same options cause no problem with cd-hit. The following command works well:

cd-hit -i seq.fasta -o out.fasta  -d 0 -T 10 -g 1 -M 10000 -c 0.6 -n 4

Am I missing something?

Thank you

cd-hit cluster • 3.9k views
5.4 years ago
NPalopoli ▴ 290

According to the cd-hit-est manual you should use one of the following combinations of threshold (-c) and word size (-n):

-n 10, 11 for thresholds 0.95 ~ 1.0
-n 8,9 for thresholds 0.90 ~ 0.95
-n 7 for thresholds 0.88 ~ 0.9
-n 6 for thresholds 0.85 ~ 0.88
-n 5 for thresholds 0.80 ~ 0.85
-n 4 for thresholds 0.75 ~ 0.8
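To avoid picking a mismatched pair by hand, the table above can be encoded as a small lookup. This is only a sketch: the function name `suggest_word_size` is made up for illustration, it returns the smaller word size where the table lists two, and it assumes each range includes its lower bound (the table itself leaves the boundaries ambiguous). It does not check what cd-hit-est will actually accept, which, going by the error in the question, appears to be -c >= 0.8.

```python
# Hypothetical helper: maps an identity threshold (-c) to a word size (-n)
# following the cd-hit-est table quoted above. Boundary handling is an
# assumption: each range is treated as including its lower bound.
WORD_SIZE_TABLE = [
    (0.95, 10),  # 0.95 ~ 1.0  -> -n 10 (11 is also listed)
    (0.90, 8),   # 0.90 ~ 0.95 -> -n 8 (9 is also listed)
    (0.88, 7),   # 0.88 ~ 0.9
    (0.85, 6),   # 0.85 ~ 0.88
    (0.80, 5),   # 0.80 ~ 0.85
    (0.75, 4),   # 0.75 ~ 0.8
]


def suggest_word_size(threshold: float) -> int:
    """Return a word size (-n) for the given identity threshold (-c)."""
    if threshold > 1.0:
        raise ValueError("threshold must be <= 1.0")
    for lower_bound, word_size in WORD_SIZE_TABLE:
        if threshold >= lower_bound:
            return word_size
    raise ValueError("no word size listed for thresholds below 0.75")


if __name__ == "__main__":
    c = 0.85
    print(f"cd-hit-est -i seq.fasta -o out.fasta -c {c} -n {suggest_word_size(c)}")
```

For -c 0.6, as in the question, the lookup raises an error because the table has no entry that low.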

I don't know whether lower thresholds are allowed, and I don't have suitable input data at hand to try it myself (BTW, you should provide a sample dataset that would allow others to replicate the error).

29 days ago
Asaf 10k

For future reference:

The 0.8 identity threshold for EST (nucleotides) is hardcoded. However, there's an option to use -D (distance) instead of -c (identity threshold). I couldn't find it in the documentation and couldn't figure out how the distance is calculated.

