Question: What's the best way to find 1:1 orthologs present in multiple species?
17 months ago by
yuanfeifuzzy0 wrote:

I am trying to find a way to get all 1:1 orthologs present in multiple species from the OMA database. Unfortunately, I can only get orthologs that are present in all species of interest, and I have no way to make sure they are 1:1 orthologs. Can anyone help me with this?

By 1:1 orthologs in multiple species, I mean something like this (an example from OrthoDB):

http://orthodb.org/?level=40674&species=40674&universal=1&singlecopy=1

The result shown on that page is what I want: across all 41 mammals there are 837 ortholog groups, each group has exactly 41 genes, and each gene is single-copy in its species.

ortholog oma • 798 views
modified 17 months ago by Christophe Dessimoz410 • written 17 months ago by yuanfeifuzzy0
17 months ago by
University College London
Christophe Dessimoz410 wrote:

OMA groups (http://omabrowser.org/oma/landOMA/) contain sets of genes in which every pair is orthologous to one another. Consequently, these groups contain at most one sequence per species. However, these sequences are not necessarily single-copy genes, and thus the relationships are not all necessarily 1:1. But depending on what you are trying to achieve (e.g. marker genes for phylogenetic tree inference), these groups may be just as good as or even better than sets of 1:1 orthologs.

If you really need 1:1 ortholog groups, we don't provide this specific output, but I can think of two ways these could be reconstructed with a bit of post-processing:

1) Retrieve the pairwise orthology relationship files, remove all non-1:1 relationships, and then build groups using the transitivity of 1:1 orthology.

2) Retrieve all hierarchical orthologous groups (HOGs) at the level of the mammals, and verify that each contains at most one sequence per species. The HOGs that pass this filter are very likely to be 1:1 orthologs. For this, you can download all HOGs (http://omabrowser.org/All/oma-hogs.orthoXML.gz) and parse the file using the "Family Analyzer" tool currently under development in our lab: https://github.com/DessimozLab/familyanalyzer
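
As a rough illustration of the check described in option 2, here is a minimal sketch that uses only Python's standard library rather than Family Analyzer. It assumes the file follows the usual orthoXML layout (namespace `http://orthoXML.org/2011/`, `species`/`gene` elements, and `orthologGroup` elements carrying a `TaxRange` property); the inline sample and the function name `single_copy_hogs` are made up for the example, and the real download is large enough that an incremental parser would probably be needed instead of loading everything at once.

```python
import xml.etree.ElementTree as ET
from collections import Counter

OX = "{http://orthoXML.org/2011/}"          # assumed orthoXML namespace
NS = {"ox": "http://orthoXML.org/2011/"}

# Tiny hand-written sample, NOT real OMA data.
SAMPLE = """<orthoXML xmlns="http://orthoXML.org/2011/" version="0.3" origin="demo" originVersion="1">
  <species name="HUMAN" NCBITaxId="9606">
    <database name="demo" version="1"><genes>
      <gene id="1" protId="HUMAN00001"/>
      <gene id="2" protId="HUMAN00002"/>
    </genes></database>
  </species>
  <species name="MOUSE" NCBITaxId="10090">
    <database name="demo" version="1"><genes>
      <gene id="3" protId="MOUSE00001"/>
      <gene id="4" protId="MOUSE00002"/>
      <gene id="5" protId="MOUSE00003"/>
    </genes></database>
  </species>
  <groups>
    <orthologGroup id="HOG:1">
      <property name="TaxRange" value="Mammalia"/>
      <geneRef id="1"/>
      <geneRef id="3"/>
    </orthologGroup>
    <orthologGroup id="HOG:2">
      <property name="TaxRange" value="Mammalia"/>
      <geneRef id="2"/>
      <paralogGroup>
        <geneRef id="4"/>
        <geneRef id="5"/>
      </paralogGroup>
    </orthologGroup>
  </groups>
</orthoXML>"""


def single_copy_hogs(root, level):
    """Return {group id: gene ids} for groups at `level` with <= 1 gene per species."""
    # Map every gene id to the species it belongs to.
    gene_species = {}
    for sp in root.findall("ox:species", NS):
        for gene in sp.iter(OX + "gene"):
            gene_species[gene.get("id")] = sp.get("name")

    kept = {}
    for grp in root.iter(OX + "orthologGroup"):
        prop = grp.find("ox:property[@name='TaxRange']", NS)
        if prop is None or prop.get("value") != level:
            continue
        # Collect all genes below this group, including nested paralog groups.
        genes = [ref.get("id") for ref in grp.iter(OX + "geneRef")]
        per_species = Counter(gene_species[g] for g in genes)
        if all(n == 1 for n in per_species.values()):
            kept[grp.get("id")] = sorted(genes)
    return kept


hogs = single_copy_hogs(ET.fromstring(SAMPLE), "Mammalia")
print(hogs)  # {'HOG:1': ['1', '3']}
```

To require that every species of interest is represented (as in the OrthoDB example), one could additionally compare `len(per_species)` with the number of target species.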

written 17 months ago by Christophe Dessimoz410

Thank you, Dessimoz! The first method you recommend is close to what I had in mind. But since the pairwise orthology relationships are deposited in a single gz file larger than 10 GB, parsing that file only for mammals may not be efficient. I will try the HOGs first and, if necessary, filter the pairwise orthology later.
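
For reference, the pairwise route can be done in a single streaming pass, so the compressed file never has to be held in memory. The sketch below is only an illustration under stated assumptions: it assumes tab-separated lines of the form `protein1<TAB>protein2<TAB>relation type`, that 1:1 relations are labelled `1:1`, and that protein IDs start with a five-character species code. The helper names (`species_of`, `one_to_one_groups`) and the demo data are invented for the example. It keeps only 1:1 pairs between the species of interest, merges them transitively with a union-find structure, and retains the groups with exactly one gene from every species.

```python
import gzip
import io
from collections import defaultdict


def species_of(protein_id):
    # Assumption: IDs begin with a five-character species code, e.g. HUMAN00001.
    return protein_id[:5]


def one_to_one_groups(lines, species):
    """Group genes linked by 1:1 relations; keep groups covering each species once."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        a, b, rel = line.rstrip("\n").split("\t")[:3]
        if rel != "1:1":
            continue
        if species_of(a) not in species or species_of(b) not in species:
            continue
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    members = defaultdict(list)
    for gene in list(parent):
        members[find(gene)].append(gene)

    groups = []
    for genes in members.values():
        seen = [species_of(g) for g in genes]
        # Exactly one gene per species, and every species present.
        if len(seen) == len(species) and set(seen) == species:
            groups.append(sorted(genes))
    return sorted(groups)


# Made-up demo data, compressed in memory to mimic reading a .gz file.
demo = "\n".join([
    "# Protein 1\tProtein 2\tOrthology type\tOMA group",
    "HUMAN00001\tMOUSE00001\t1:1\t1",
    "MOUSE00001\tRATNO00001\t1:1\t1",
    "HUMAN00001\tRATNO00001\t1:1\t1",
    "HUMAN00002\tMOUSE00002\t1:n\t",
    "HUMAN00002\tMOUSE00003\t1:n\t",
    "HUMAN00003\tMOUSE00004\t1:1\t",
]) + "\n"

with gzip.open(io.BytesIO(gzip.compress(demo.encode())), "rt") as handle:
    groups = one_to_one_groups(handle, {"HUMAN", "MOUSE", "RATNO"})
print(groups)  # [['HUMAN00001', 'MOUSE00001', 'RATNO00001']]
```

With a real file, `gzip.open(path, "rt")` would replace the in-memory buffer; only the 1:1 pairs among the chosen species are kept in memory.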

written 17 months ago by yuanfeifuzzy0

To retrieve the pairwise relationships between specific species pairs, you can also use the SOAP API (http://omabrowser.org/oma/APISOAP/). Have a look at the sample Python or Perl code.

written 17 months ago by Christophe Dessimoz410