Sorting clusters from cd-hit
16 months ago
Nabil • 0

Greetings

I have used cd-hit to cluster proteins by sequence similarity, and now I need to sort the resulting clusters. Here is an example of the output:

>Cluster 0
0    287aa, >CM_M_XP_007408389.1... at 62.72%
1    293aa, >CM_M_XP_007408535.1... *
>Cluster 1
0    291aa, >CM_TX_POW04575.1... at 100.00%
1    292aa, >CM_ST_POW09224.1... *
>Cluster 2
0    286aa, >CM_PG_KAA1076669.1... *
>Cluster 3
0    285aa, >CM_M_XP_007406760.1... *
>Cluster 4
0    278aa, >CM_PS_KNZ46184.1... *
1    275aa, >CM_PT_OAV90755.1... at 58.91%
>Cluster 5
0    241aa, >CM_PG_KAA1096683.1... at 51.04%
1    266aa, >CM_PG_KAA1096686.1... *
2    236aa, >CM_PG_KAA1113276.1... at 50.85%
3    262aa, >CM_PG_KAA1113279.1... at 94.66%
4    241aa, >CM_CRL_EFP86512.1... at 50.62%
5    248aa, >CM_ST_POW02451.1... at 52.02%
>Cluster 6
0    251aa, >CM_PS_KNZ44295.1... *
>Cluster 7
0    236aa, >CM_PG_KAA1083848.1... at 88.98%
1    250aa, >CM_PG_KAA1119265.1... *
2    250aa, >CM_CRL_EHS63005.1... at 100.00%
>Cluster 8
0    250aa, >CM_PS_KNZ57382.1... *
>Cluster 9
0    236aa, >CM_PG_KAA1105946.1... *
1    236aa, >CM_PG_KAA1114903.1... at 97.46%
2    236aa, >CM_CRL_EFP74682.1... at 97.46%
3    235aa, >CM_TX_POW15956.1... at 84.68%
4    235aa, >CM_ST_POW04000.1... at 84.68%
5    232aa, >CM_PT_OAV92548.1... at 88.79%

Is there a way to sort the clusters by the number of proteins in each one, without doing it manually?

Here's an image in case the text is messed up:

image of text
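
One possible way to do this without manual work, as a minimal sketch rather than a definitive solution: read the cluster file, group the member lines under each `>Cluster` header, and reorder the groups by how many members they contain. The function names `parse_clusters` and `sort_clusters_by_size` and the file name `proteins.clstr` are made up for illustration, and the sketch assumes the file looks like the example above (a `>Cluster N` header followed by one line per protein).

```python
def parse_clusters(text):
    """Split cd-hit cluster text into (header, member_lines) pairs."""
    clusters = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">Cluster"):
            # a header line opens a new, initially empty cluster
            clusters.append((line, []))
        elif clusters:
            # any other line is a member of the most recent cluster
            clusters[-1][1].append(line)
    return clusters


def sort_clusters_by_size(text, descending=True):
    """Return the cluster text reordered by number of members per cluster."""
    clusters = parse_clusters(text)
    # list.sort is stable, so clusters of equal size keep their original order
    clusters.sort(key=lambda c: len(c[1]), reverse=descending)
    out_lines = []
    for header, members in clusters:
        out_lines.append(header)
        out_lines.extend(members)
    return "\n".join(out_lines) + "\n"


# Hypothetical usage (file name is an assumption):
#     with open("proteins.clstr") as fh:
#         print(sort_clusters_by_size(fh.read()), end="")
```

With the example above, this would put Cluster 5 and Cluster 9 (six proteins each) first and the single-protein clusters last. The original cluster numbers are kept in the headers, so the sorted output can still be matched back to the cd-hit results.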

cd-hit • 602 views

What is the question here?

I formatted the list you included at the top properly (using the 10101 code option in the editor), but it looks similar to the screenshot you posted.
