How can I run cd-hit-est with a cluster threshold less than 0.8?
6.4 years ago
m.koohi.m ▴ 120

Dear friends,

I am trying to run cd-hit-est with a cluster threshold below 0.8, but every time I get the following error:

Fatal Error: invalid clstr threshold, should >=0.8 Program halted !!

I tried:

cd-hit-est -i seq.fasta -o out.fasta  -d 0 -T 10 -g 1 -M 10000 -c 0.6 -n 4

The same options cause no problem with cd-hit. The following command works well:

cd-hit -i seq.fasta -o out.fasta  -d 0 -T 10 -g 1 -M 10000 -c 0.6 -n 4

Am I missing something?

Thank you

cd-hit cluster • 3.9k views
5.4 years ago
NPalopoli ▴ 290

According to the cd-hit-est manual you should use one of the following combinations of threshold (-c) and word size (-n):

-n 10, 11 for thresholds 0.95 ~ 1.0
-n 8, 9 for thresholds 0.90 ~ 0.95
-n 7 for thresholds 0.88 ~ 0.9
-n 6 for thresholds 0.85 ~ 0.88
-n 5 for thresholds 0.80 ~ 0.85
-n 4 for thresholds 0.75 ~ 0.8
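
The ranges above can be turned into a small lookup so that -n always matches -c. This is only a minimal sketch: `pick_word_size` is a hypothetical helper, not part of cd-hit, and it assumes that a threshold sitting exactly on a boundary belongs to the higher range and that the smaller of two listed word sizes is good enough.

```shell
# Hypothetical helper (not part of cd-hit): print a word size (-n) for a
# given identity threshold (-c), following the ranges listed above.
# Assumptions: boundary values go to the higher range, and where two word
# sizes are listed the smaller one is returned.
pick_word_size() {
  awk -v c="$1" 'BEGIN {
    # Anything outside the listed ranges is reported instead of guessed.
    if (c > 1.0 || c < 0.75) { print "unsupported"; exit }
    if (c >= 0.95)      print 10
    else if (c >= 0.90) print 8
    else if (c >= 0.88) print 7
    else if (c >= 0.85) print 6
    else if (c >= 0.80) print 5
    else                print 4
  }'
}
```

For example, `pick_word_size 0.92` prints `8`, while `pick_word_size 0.6` prints `unsupported`, because 0.6 lies below every range in the list. The helper only mirrors the list; it says nothing about whether cd-hit-est itself accepts a given threshold.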

I don't know whether lower thresholds are allowed, and I don't have suitable input data at hand to try it myself (BTW, you should provide a sample dataset so that others can reproduce the error).

29 days ago
Asaf 10k

For future reference

The 0.8 identity threshold for cd-hit-est (nucleotide sequences) is hardcoded. However, there is an option to use -D (distance) instead of -c (identity threshold). For some reason I couldn't find it in the documentation, and I couldn't figure out how the distance is calculated.
