Visualize CD-HIT output file
2.5 years ago
m90 ▴ 30

Hello there,

This is my first time using CD-HIT for clustering, and my output file looks like the one below. Is there a script or tool I can run on Linux to view the output as a graph?

>Cluster 0
0       15679nt, >SpecA_Contig35475... at +/99.99%
1       15436nt, >SpecA_Contig35476... at +/99.62%
2       15764nt, >SpecB_Contig18540... *
3       15438nt, >SpecA_Contig39392... at +/99.69%
4       15679nt, >SpecC_comp263440_c8_seq4... at -/99.99%
>Cluster 1
0       15684nt, >SpecC_SB1234_Contig35474... at +/99.98%
1       15685nt, >SpecC_Contig11682... *
>Cluster 2
0       15684nt, >SpecA_comp263440_c8_seq3... at -/99.98%
1       15672nt, >SpecB_comp263440_c8_seq5... at -/99.97%
CD-HIT visualization • 1.3k views

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.

2.5 years ago
Mensur Dlakic ★ 27k

What type of graph do you have in mind? Something showing the number of clusters? Average number of cluster members? Average length of cluster sequences?

Whatever it is, you may want to start from this script, which will convert the CD-HIT output into a clustering solution. It should be easier to create a graph of any kind from it.

https://github.com/jrjhealey/bioinfo-tools/blob/master/ParseCDHIT.py
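If a quick look at cluster sizes is enough, a few lines of Python may already do the job. The sketch below is not the linked ParseCDHIT.py; it is an independent, minimal example that assumes the `.clstr` layout shown in the question (a `>Cluster N` header followed by member lines, with `*` marking the representative). The names `parse_clstr`, `text_bar_chart`, and `EXAMPLE` are made up for illustration, and the chart is plain text so it runs in any Linux terminal without extra packages.

```python
import re

# Assumed member-line layout, taken from the example in the question:
#   "0       15679nt, >SpecA_Contig35475... at +/99.99%"
#   "2       15764nt, >SpecB_Contig18540... *"
MEMBER_RE = re.compile(r"^(\d+)\s+(\d+)(?:nt|aa),\s+>(.+?)\.\.\.\s+(.*)$")

# The cluster listing from the question, used here as test input.
EXAMPLE = """\
>Cluster 0
0       15679nt, >SpecA_Contig35475... at +/99.99%
1       15436nt, >SpecA_Contig35476... at +/99.62%
2       15764nt, >SpecB_Contig18540... *
3       15438nt, >SpecA_Contig39392... at +/99.69%
4       15679nt, >SpecC_comp263440_c8_seq4... at -/99.99%
>Cluster 1
0       15684nt, >SpecC_SB1234_Contig35474... at +/99.98%
1       15685nt, >SpecC_Contig11682... *
>Cluster 2
0       15684nt, >SpecA_comp263440_c8_seq3... at -/99.98%
1       15672nt, >SpecB_comp263440_c8_seq5... at -/99.97%
"""


def parse_clstr(lines):
    """Return {cluster_id: [member dicts]} from the lines of a .clstr file."""
    clusters = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = int(line.split()[1])
            clusters[current] = []
            continue
        match = MEMBER_RE.match(line)
        if match is None or current is None:
            continue  # skip lines that do not look like cluster members
        _, length, name, tail = match.groups()
        clusters[current].append({
            "name": name,
            "length": int(length),
            "representative": tail.strip() == "*",
        })
    return clusters


def text_bar_chart(clusters):
    """Build one text bar per cluster; bar length is the member count."""
    return "\n".join(
        f"Cluster {cid:>3} | {'#' * len(members)} {len(members)}"
        for cid, members in sorted(clusters.items())
    )


if __name__ == "__main__":
    # For a real file, replace EXAMPLE.splitlines() with an open file handle.
    print(text_bar_chart(parse_clstr(EXAMPLE.splitlines())))
```

The dictionary returned by `parse_clstr` also holds sequence lengths and the representative flag, so the other summaries mentioned above (number of clusters, average members per cluster, average sequence length) can be computed from it, or the counts can be handed to a plotting library if a graphical figure is preferred.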
