Question: CD-HIT cannot completely remove redundant transcripts
xioli2013 wrote (2.5 years ago):

I used CD-HIT-EST to remove redundancy from Trinity de novo transcripts, but I still see redundant annotations. The command was:

cd-hit-est -i trinity.fasta -o clstr_out -c 0.9 -n 9

for example:

TRINITY_DN1855_c5_g1, TRINITY_DN1855_c1_g1

Both are annotated as dnaK. The two sequences align at 92% identity, but CD-HIT does not cluster them:

>TRINITY_DN1855_c5_g1 
CGCCAAGAAGACCGAGATCTACAGCACCGCCGAAAACAACCAGCCCGGTGTGGAAATCAACGTGCTGCAAGGCAAGCGCC
CCATGGCCGCCGACAACAGGTCCCTGGGCCGCTTCAAGCTCGAGGGCATTCCCCCCATGCCCGCAGGCTGCGCCCAGATC
GAAGTGACCTTCGGTATCGACGCCAACGGCATTCTGCATGTCACCGCCAAGGAAAAGACCAGCAGCAAGGAAAGCAGCAT
CCGCATCGGGAACACCACCACCCTCGACAAGAGTGACGTGGAGCGCATGGTGCAGGAAACCGAGCAGAACGCCGCCGCCG
ACAGGGCCCGCAAGGAGAAGGTCGAGAAACGCAACAACCTCGACTCGCTGCGC
>TRINITY_DN1855_c1_g1
AGGGCGGCATGATTGCCCCGATGGTTACCCGCAACACCACCGTGCCCGTCAAGAAGACCGAGATCTACACCACTGCCGAAAA
CAACCAGCCCGGCGTGAAAATCAACGTGCTGCAAGGCGAGCACCCCATGGCCGCCGACAACAAGTCTCTGGGCCGCTTCAAGCTCGAAGGCGTTCCCCCCATGCCCGCAGGCCGCGTCCAGATCGAAGTGACCTTCGATAT

Trying lower thresholds (-c 0.89, -c 0.88) did not reduce the redundancy; it actually increased the number of transcripts.

I would appreciate your comments on what the problem is and how to address it.

Thanks,

Xp
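A note on the lower thresholds tried above: as far as I recall, the CD-HIT user guide pairs each cd-hit-est identity threshold (-c) with a recommended word size (-n), and -n 9 is listed only for thresholds of 0.90 and above. Whether this explains the increase in transcript count is not established here. The sketch below picks a word size from those ranges and builds the command line. The table is reproduced from memory and the helper names are hypothetical, so check both against the guide for your installed version.

```python
# Word-size ranges for cd-hit-est as recalled from the CD-HIT user guide.
# ASSUMPTION: reproduced from memory; verify against your installed version.
# Each entry is (lower bound of the -c range, one word size the guide lists for it).
WORD_SIZE_RANGES = [
    (0.95, 10),  # guide lists -n 10 or 11 for 0.95 to 1.0
    (0.90, 9),   # guide lists -n 8 or 9 for 0.90 to 0.95
    (0.88, 7),   # 0.88 to 0.90
    (0.85, 6),   # 0.85 to 0.88
    (0.80, 5),   # 0.80 to 0.85
    (0.75, 4),   # 0.75 to 0.80
]


def suggest_word_size(identity):
    """Return a word size (-n) from the recalled range for a -c threshold."""
    if not 0.75 <= identity <= 1.0:
        raise ValueError("threshold outside the 0.75-1.0 range covered by the table")
    for lower_bound, word_size in WORD_SIZE_RANGES:
        if identity >= lower_bound:
            return word_size


def build_command(fasta, out_prefix, identity):
    """Build (but do not run) a cd-hit-est command as a list of arguments."""
    word_size = suggest_word_size(identity)
    return ["cd-hit-est", "-i", fasta, "-o", out_prefix,
            "-c", str(identity), "-n", str(word_size)]


if __name__ == "__main__":
    for threshold in (0.90, 0.89, 0.88):
        print(" ".join(build_command("trinity.fasta", "clstr_out", threshold)))
```

The commands are only printed, not executed, so they can be inspected before running anything.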

Tags: trinity, cd-hit-est
genomax commented (2.5 years ago):

See this thread for further assistance: "how to use CD_HIT to remove the redundant sequence from trinity output file"
lakhujanivijay wrote (2.5 years ago):

There are inherent limitations to the CD-HIT algorithm that we should be aware of. Please see this link.

Also see the CD-HIT-2D comparison algorithm on the same page.
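One limitation that may apply to the example in the question, offered as a guess rather than a diagnosis: by default cd-hit-est uses global identity (-G 1), which the user guide defines as the number of identical bases in the alignment divided by the full length of the shorter sequence. The 92% quoted in the question looks like a local alignment figure. The two sequences also appear to overlap only partially: the second starts upstream of the first, and the first extends further downstream. In that situation the global value can fall well below the -c cutoff. The sketch below shows the arithmetic with hypothetical numbers, not values measured from the posted sequences.

```python
def local_identity(identical, alignment_length):
    """Identical bases divided by the length of the aligned region."""
    return identical / alignment_length


def global_identity(identical, len_a, len_b):
    """Identical bases divided by the full length of the shorter sequence.

    This follows the definition of the default (-G 1) identity given in
    the CD-HIT user guide.
    """
    return identical / min(len_a, len_b)


if __name__ == "__main__":
    # HYPOTHETICAL numbers for illustration only: a 200-column overlap with
    # 184 identical bases between a 400 bp and a 250 bp transcript.
    identical = 184
    alignment_length = 200
    len_a, len_b = 400, 250
    threshold = 0.9  # the -c value used in the question

    local = local_identity(identical, alignment_length)
    glob = global_identity(identical, len_a, len_b)

    print(f"local identity:  {local:.3f}  passes -c {threshold}: {local >= threshold}")
    print(f"global identity: {glob:.3f}  passes -c {threshold}: {glob >= threshold}")
```

If local identity is what you want, the user guide describes -G 0 used together with alignment coverage controls such as -aS or -aL, and warns against using -G 0 without them.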
