Sequence clustering and motif identification.
9.0 years ago
gtho123 ▴ 260

I have an alignment of more than 300 homologous sequences from different samples. All are the same length, around 15,000 bases. I don't expect any to be identical, but I wish to cluster them and identify the motifs or individual bases that distinguish each cluster, or that are more characteristic of it than of any other cluster.

I realize I could do some variation on hierarchical clustering, but I am curious whether anyone has advice on how to proceed.
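For reference, a minimal sketch of what "some variation on hierarchical clustering" could look like on an alignment. It uses the fraction of mismatched columns (Hamming distance) and naive average linkage; the function names, the toy sequences, and the choice of two clusters are illustrative assumptions, not anything from a particular tool. This pure-Python version is only meant to show the idea; for 300 sequences of 15,000 bases a library such as SciPy (`scipy.cluster.hierarchy`) would be the more practical choice.

```python
from itertools import combinations

def hamming(a, b):
    """Fraction of aligned columns at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def average_linkage(seqs, n_clusters):
    """Naive agglomerative clustering; returns lists of indices into seqs."""
    # Precompute all pairwise distances once.
    dist = {(i, j): hamming(seqs[i], seqs[j])
            for i, j in combinations(range(len(seqs)), 2)}

    def d(i, j):
        return 0.0 if i == j else dist[(min(i, j), max(i, j))]

    # Start with every sequence in its own cluster.
    clusters = [[i] for i in range(len(seqs))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average distance over all cross-cluster pairs.
            total = sum(d(i, j) for i in clusters[a] for j in clusters[b])
            avg = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or avg < best[0]:
                best = (avg, a, b)
        _, a, b = best
        # Merge the closest pair (a < b, so deleting b keeps a valid).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy alignment (hypothetical data): two obvious groups.
seqs = ["ACGTACGT", "ACGTACGA", "TTGTACCC", "TTGTACCT"]
clusters = average_linkage(seqs, 2)
print(clusters)
```

The number of clusters is fixed up front here for simplicity; cutting a dendrogram at a distance threshold is a common alternative.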

Any comments appreciated.

Clustering alignment sequence motifs • 2.1k views
9.0 years ago
dago ★ 2.8k

You could use h-cd-hit to divide them into clusters: http://cd-hit.org/

Then use each cluster to discover motifs with a program from the MEME suite: http://meme.nbcr.net/meme/
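Since the sequences are already aligned, single bases that separate the clusters can also be read straight off the alignment columns once cluster membership is known (from cd-hit or any other method). The sketch below is one simple way to do that, not what MEME does: it reports columns where a base is nearly fixed inside a cluster and rare outside it. The function name, the thresholds `min_in` and `max_out`, and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter

def characteristic_columns(seqs, clusters, min_in=0.9, max_out=0.1):
    """Map cluster index -> list of (column, base) typical of that cluster only.

    Columns are 0-based positions in the alignment.
    """
    result = {}
    n_cols = len(seqs[0])
    for k, members in enumerate(clusters):
        member_set = set(members)
        others = [i for i in range(len(seqs)) if i not in member_set]
        hits = []
        for col in range(n_cols):
            # Most common base in this column within the cluster.
            base, count = Counter(seqs[i][col] for i in members).most_common(1)[0]
            freq_in = count / len(members)
            # How often the same base appears in all other sequences.
            if others:
                freq_out = sum(seqs[i][col] == base for i in others) / len(others)
            else:
                freq_out = 0.0
            if freq_in >= min_in and freq_out <= max_out:
                hits.append((col, base))
        result[k] = hits
    return result

# Toy alignment and cluster assignment (hypothetical data).
seqs = ["ACGTACGT", "ACGTACGA", "TTGTACCC", "TTGTACCT"]
clusters = [[0, 1], [2, 3]]
columns = characteristic_columns(seqs, clusters)
for k, hits in columns.items():
    print(k, hits)
```

Runs of adjacent characteristic columns would then be candidates for short distinguishing motifs, which could be compared against what a motif finder reports for each cluster.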
