8.3 years ago
Nitha
Hi All,
I have compared two whole protein sets (human against bacteria) using the cd-hit-2d program with an identity threshold of 0.7 (70%), and I have got some results. I am not able to analyse them. I need to check the results and pick out the non-homologous sequences. Can anyone help me find them?
Thanks
You're going to have to elaborate on "I'm not able to analyse". CD-HIT is a clustering tool. Unless outliers were filtered, or there were homologous outliers, you will find them in singleton clusters. Start with the singletons and work your way upwards to maybe 2- or 3-sized clusters.
Thanks Ram, for replying!
I have the output file db2.cluster sorted, and I need to take the accession IDs from it to retrieve the sequences. Taking the IDs manually from the result takes too long for big data. If I am not wrong, I have to take each ID from the matched clusters first, followed by the novel ones. How can I extract these accession numbers separately? Is there any method or program for this? Please guide me.
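One way to avoid copying IDs by hand is a small script that reads the cluster file and lists the IDs in singleton clusters, following the suggestion above to start with the singletons. This is only a sketch: it assumes the usual CD-HIT cluster layout (a ">Cluster N" header followed by member lines such as "0	350aa, >ID... *"), and the function names and sample IDs are made up for illustration. Whether singletons really correspond to non-homologous sequences depends on how cd-hit-2d was run, so check a few by eye first.

```python
import re
import sys
from typing import Dict, Iterable, List

# Member lines look like: "0<TAB>350aa, >sp|P12345|NAME... *"
# The ID is whatever sits between ">" and the trailing "...".
ID_PATTERN = re.compile(r">(.+?)\.\.\.")


def parse_clstr(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each cluster name to the list of sequence IDs it contains."""
    clusters: Dict[str, List[str]] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = line[1:]          # e.g. "Cluster 0"
            clusters[current] = []
        elif current is not None:
            match = ID_PATTERN.search(line)
            if match:
                clusters[current].append(match.group(1))
    return clusters


def singleton_ids(clusters: Dict[str, List[str]]) -> List[str]:
    """Return the IDs of clusters that contain exactly one sequence."""
    return [ids[0] for ids in clusters.values() if len(ids) == 1]


# Hypothetical sample in the assumed format, for illustration only.
SAMPLE = """>Cluster 0
0\t350aa, >sp|P12345|EXAMPLE_A... *
1\t348aa, >sp|Q67890|EXAMPLE_B... at 85.00%
>Cluster 1
0\t210aa, >sp|A11111|EXAMPLE_C... *
"""

if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Path to your own cluster file, given on the command line.
        with open(sys.argv[1]) as handle:
            found = singleton_ids(parse_clstr(handle))
    else:
        found = singleton_ids(parse_clstr(SAMPLE.splitlines()))
    for seq_id in found:
        print(seq_id)
```

Run it with the cluster file as the only argument and redirect the output to a text file; that gives one ID per line, which can then be used to pull the matching records from the original FASTA file. Note that CD-HIT may truncate long sequence names in the cluster file, so the IDs may need to be matched on their leading part.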