What to do with the clustered groups of proteins from OrthoMCL?
9.7 years ago
JstRoRR ▴ 60

Hi,

I am trying to create a substitution matrix for some bacterial species. As a preliminary step, I searched for orthologs (protein sequences) using OrthoMCL. The program outputs a groups.txt file containing groups of sequences clustered on the basis of similarity. The next step is to pairwise align the orthologs and build the substitution matrix from those alignments.

My question is: should I use the orthologs grouped by the program in the groups.txt file, or the paired orthologs in the pairs folder (orthologs.txt) of the output?
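Either way, the alignment step needs a list of sequence pairs. A minimal sketch of deriving cross-species pairs from the groups file, assuming each line of groups.txt looks like `group_id: taxon|protein taxon|protein ...` (check this against your own output). The function names `parse_groups` and `cross_species_pairs` are made up for illustration and are not part of OrthoMCL:

```python
from itertools import combinations


def parse_groups(lines):
    """Map each group ID to its list of 'taxon|protein' members."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        group_id, _, members = line.partition(":")
        groups[group_id.strip()] = members.split()
    return groups


def cross_species_pairs(members):
    """Return all member pairs whose taxon prefixes differ."""
    pairs = []
    for a, b in combinations(members, 2):
        # The taxon code is assumed to be the part before the first '|'.
        if a.split("|", 1)[0] != b.split("|", 1)[0]:
            pairs.append((a, b))
    return pairs
```

Usage would be something like `parse_groups(open("groups.txt"))`, then `cross_species_pairs(members)` for each group. Same-taxon pairs are skipped here because a group can also contain in-paralogs from a single genome.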

genome • 3.8k views

Be warned that OrthoMCL does not output singleton clusters, i.e. genes that are unique and end up in no group.

Other tools you might want to consider are ProteinOrtho5, kClust and CD-HIT, which can do similar things (and much faster).

Thanks, Torst, for suggesting alternative tools. I will go through them.

I have one more question, in case you have any idea: is there a tool that can create a substitution matrix from multiple alignments? Or do I have to do it manually or write a fresh script for that?
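For reference, the core of such a script is short. A rough sketch of the log-odds idea behind BLOSUM-style matrices: count residue pairs column by column, then score each pair as log2(observed / expected). It assumes the aligned sequences are equal-length strings with `-` for gaps, and it leaves out the sequence clustering/weighting, pseudocounts and integer scaling that a real matrix would need. The function names are illustrative only:

```python
from collections import Counter
from itertools import combinations
from math import log2


def count_pairs(alignment):
    """Count unordered residue pairs in every column, skipping gaps."""
    counts = Counter()
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        for a, b in combinations(residues, 2):
            counts[tuple(sorted((a, b)))] += 1
    return counts


def log_odds(counts):
    """Score each observed pair as log2(observed / expected frequency)."""
    total = sum(counts.values())
    # Background frequency of a residue: each pair contributes half
    # of its frequency to each of its two residues.
    freq = Counter()
    for (a, b), n in counts.items():
        freq[a] += n / (2 * total)
        freq[b] += n / (2 * total)
    scores = {}
    for (a, b), n in counts.items():
        observed = n / total
        # Unordered mismatches can arise two ways (a,b or b,a).
        expected = freq[a] * freq[b] * (1 if a == b else 2)
        scores[(a, b)] = log2(observed / expected)
    return scores
```

Counts from many alignments can be summed (`Counter` supports `+`) before calling `log_odds`. Pairs that are never observed get no score at all here, which is one reason pseudocounts are usually added in practice.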

OrthoMCL does provide a Perl script to extract singletons.

OrthoMCL just gives the list of sequences not present in the groups file. Whether each of these is a true singleton is doubtful, though, as the remaining sequences may still have orthologs among themselves.
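In other words, the "singleton" list is just a set difference, which can be reproduced in a few lines. A minimal sketch, assuming you already have all sequence IDs and the member lists of each group in the same `taxon|protein` form; `find_ungrouped` is a hypothetical helper, not the OrthoMCL script itself:

```python
def find_ungrouped(all_ids, group_members):
    """Return the sorted IDs that appear in no group.

    all_ids: iterable of every sequence ID in the input proteomes.
    group_members: iterable of member lists, one list per group.
    """
    grouped = {member for members in group_members for member in members}
    return sorted(set(all_ids) - grouped)
```

Nothing in this step checks the leftover sequences against each other, so being on the list only means "not clustered", not "has no ortholog".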
