phylogenetic tree after ortholog finding by orthoMCL
8.2 years ago
Mehmet ▴ 820

Dear All:

I have finished identifying orthologous proteins in the proteomes of 16 species using OrthoMCL. I would now like to build a phylogenetic tree from the orthologous proteins shared among those species. Could you please advise how to do that? For instance, I have the ortholog IDs from the OrthoMCL output; how can I retrieve the corresponding protein sequences from my own protein data and use them to build the tree?

Thank you for your time.

genome alignment sequence gene blast • 4.5k views

Strictly speaking, you should be inferring orthologs from the phylogenetic tree, not the other way around. The procedure for building a tree is to first construct a multiple sequence alignment, then use one of several methods to reconstruct the tree. This tutorial is a good introduction.


Maybe he wants to build a phylogeny based on conserved genes.


Possibly, but by definition orthologs can only be inferred from a tree, so you can't call your sequences orthologs, however similar they are, until you've built the tree. Relying on pairwise alignments alone is a shortcut that can make mistakes. For example, between-species paralogs would go undetected and be incorrectly called orthologs.

8.2 years ago
Naren ▴ 990

BPGA can process OrthoMCL output to generate a binary (1/0) matrix, and it can also concatenate the orthologous core genes and construct a UPGMA or NJ phylogeny from them.

Your proteome set is not that large, so you could recluster it using the much faster USEARCH (the default clustering tool in BPGA). It should not take more than 5 minutes.
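
For illustration only, here is a minimal Python sketch of what such a 1/0 matrix could look like when built by hand. It is not BPGA's code. It assumes the usual OrthoMCL groups layout (`group_id: taxon|gene taxon|gene ...`) and treats the matrix as gene presence/absence per species; the function name and the example species codes are made up.

```python
def presence_absence(group_lines, taxa):
    """Return {group_id: [1 or 0 per taxon]} from OrthoMCL-style group lines.

    Assumes each line looks like 'OG1: spA|gene1 spB|gene2 ...'.
    """
    matrix = {}
    for line in group_lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        # Collect the taxon prefix (text before the first '|') of every member.
        present = {m.split("|", 1)[0] for m in members.split()}
        matrix[group_id.strip()] = [1 if t in present else 0 for t in taxa]
    return matrix


# Hypothetical example input with three species.
pa_taxa = ["spA", "spB", "spC"]
pa_lines = [
    "OG1: spA|a1 spB|b1 spC|c1",
    "OG2: spA|a2 spA|a3 spC|c2",
    "OG3: spB|b2",
]
pa_matrix = presence_absence(pa_lines, pa_taxa)
for group_id, row in pa_matrix.items():
    print(group_id, *row)
```

Each row is one orthologous group and each column one species, in the order given by `pa_taxa`.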

8.2 years ago
Mehmet ▴ 820

Hi,

After finding orthologs with OrthoMCL, I need to isolate the single-copy genes from the OrthoMCL output and extract their sequences to build a phylogenetic tree. I have completed the OrthoMCL step, but I need help with what comes next. How do I get the single-copy genes and their protein sequences?


To get single-copy genes, filter your OrthoMCL output so that each orthologous cluster/group (one per line) contains genes (gene headers/IDs) from all the genomes under study, but no more than one gene from any genome. Those gene IDs can then be used to extract the protein sequences from the individual protein files, for example with a small Perl script. Even if you find only a few such clusters, you can build an alignment by concatenating them and generate a tree from it. The BPGA tool I mentioned can build trees from orthologous sequences, but not specifically from single-copy genes.
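
The filtering and extraction described above could be sketched roughly as follows (in Python rather than Perl). This is only a sketch under some assumptions: the groups file uses the usual OrthoMCL layout (`group_id: taxon|gene taxon|gene ...`), the FASTA headers start with the same `taxon|gene` ID, and all function names and example data are hypothetical. The concatenation step assumes each group's sequences have already been aligned with an external aligner so they have equal length; the toy sequences here are simply given at equal length.

```python
from collections import Counter


def parse_groups(lines):
    """Parse group lines into {group_id: [(taxon, gene_id), ...]}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        groups[group_id.strip()] = [tuple(m.split("|", 1)) for m in members.split()]
    return groups


def single_copy_groups(groups, taxa):
    """Keep groups that have exactly one gene from every taxon."""
    taxa = set(taxa)
    kept = {}
    for group_id, members in groups.items():
        counts = Counter(taxon for taxon, _ in members)
        if set(counts) == taxa and all(n == 1 for n in counts.values()):
            kept[group_id] = dict(members)  # {taxon: gene_id}
    return kept


def read_fasta(lines):
    """Minimal FASTA reader: {first word of header: sequence}."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None and line:
            seqs[name].append(line)
    return {k: "".join(v) for k, v in seqs.items()}


def extract(single, seqs):
    """Look up the sequence of each single-copy gene: {group_id: {taxon: seq}}."""
    return {
        gid: {taxon: seqs[f"{taxon}|{gene}"] for taxon, gene in members.items()}
        for gid, members in single.items()
    }


def concatenate(per_group, taxa):
    """Join each taxon's (already aligned) sequences across groups, in sorted group order."""
    return {t: "".join(per_group[g][t] for g in sorted(per_group)) for t in taxa}


# Hypothetical example data: three species, four groups.
taxa = ["spA", "spB", "spC"]
group_lines = [
    "OG1: spA|a1 spB|b1 spC|c1",
    "OG2: spA|a2 spA|a3 spB|b2 spC|c2",  # two genes from spA, rejected
    "OG3: spA|a4 spB|b3",                # spC missing, rejected
    "OG4: spA|a5 spB|b4 spC|c3",
]
fasta_lines = [
    ">spA|a1 some description", "MKT", "LLV",
    ">spB|b1", "MKTLIV",
    ">spC|c1", "MRTLLV",
    ">spA|a5", "GG",
    ">spB|b4", "GA",
    ">spC|c3", "GG",
]

single = single_copy_groups(parse_groups(group_lines), taxa)
per_group = extract(single, read_fasta(fasta_lines))
supermatrix = concatenate(per_group, taxa)
for taxon, seq in supermatrix.items():
    print(f">{taxon}\n{seq}")
```

With real data you would read the lines from your groups file and from the combined protein FASTA, write each group's sequences to its own file, align each file, and only then concatenate the alignments before building the tree.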
