Question: Finding orthologs in an organism with a genome duplication.
6 months ago by
bio.erikson wrote:

I'm trying to map human orthologs to the allotetraploid Xenopus laevis. I've tried reciprocal best hit (RBH) BLAST and looked at a number of other ortholog-finding tools, but they all seem designed to find one-to-one relationships. This is problematic for X. laevis, which has two copies of almost every gene. Can anyone recommend a workflow that can handle many-to-one relationships, or does anyone have any bright ideas?

I've also tried Xenbase's manually curated human orthology, but I need Ensembl IDs, and converting from Xenbase to Entrez to Ensembl is very messy.

modified 6 months ago • written 6 months ago by bio.erikson

It depends on your workflow. If you are starting from FASTA sequences, you only need to filter out near-identical sequences, which should remove most of the duplicated proteins. This can be done with CD-HIT:

cd-hit -i input.fas -o input.95 -c 0.95

When two or more sequences share >= 95% identity, CD-HIT keeps only the longest sequence in that cluster as its representative. After this step, run the orthology search as usual.

written 6 months ago by Mensur Dlakic

Two problems. First, the whole-genome duplication (WGD) is ancient, so many of the duplicated genes have low sequence identity (> 70%) but still have functionally identical roles. Second, I need to know the human ortholog for both gene copies.

written 6 months ago by bio.erikson

CD-HIT can cluster at 70% identity, and even down to 40%.

When you find a human ortholog for one of the two protein copies, presumably you have found it for the other copy as well. It is a simple functional transfer. CD-HIT creates .clstr files, which tell you which proteins were grouped together.
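As an illustration of that transfer step, here is a minimal Python sketch that reads CD-HIT cluster output and copies a known human ortholog to every other member of the same cluster. The gene names (tp53.L, tp53.S, sox2.L), the label HUMAN_TP53, and the function names are made up for the example, and the sample text only assumes the usual layout of a cluster file (a ">Cluster N" header followed by member lines of the form "index, length, >name... "). Clusters whose members map to conflicting human genes are skipped rather than guessed.

```python
import re
from collections import defaultdict

# Hypothetical cluster text in the usual CD-HIT layout (tab after the index).
SAMPLE = (
    ">Cluster 0\n"
    "0\t412aa, >tp53.L... *\n"
    "1\t405aa, >tp53.S... at 78.40%\n"
    ">Cluster 1\n"
    "0\t300aa, >sox2.L... *\n"
)

# Captures the sequence name between '>' and the trailing '...'.
MEMBER_RE = re.compile(r">(.+?)\.\.\.")


def parse_clstr(text):
    """Return {cluster_number: [sequence names]} from cluster-file text."""
    clusters = defaultdict(list)
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = int(line.split()[1])
            continue
        match = MEMBER_RE.search(line)
        if match and current is not None:
            clusters[current].append(match.group(1))
    return dict(clusters)


def transfer_orthologs(clusters, known):
    """Assign each cluster member the human ortholog known for any member.

    Clusters with no known ortholog, or with conflicting ones, are skipped.
    """
    result = {}
    for members in clusters.values():
        humans = {known[m] for m in members if m in known}
        if len(humans) == 1:
            human = next(iter(humans))
            for member in members:
                result[member] = human
    return result


if __name__ == "__main__":
    clusters = parse_clstr(SAMPLE)
    print(transfer_orthologs(clusters, {"tp53.L": "HUMAN_TP53"}))
```

The result is a many-to-one table (several X. laevis proteins pointing at one human gene), which is the shape the original question asks for.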

written 6 months ago by Mensur Dlakic

I am developing software for finding local alignments. Could you please share one (or more) of the sequences from Xenopus laevis? I'd like to check whether the results could be helpful for you. I have had success aligning sequences from some highly diverged species. Then maybe I could think of a workflow...

modified 6 months ago • written 6 months ago by juanjo75es

You can use OrthoFinder, which will give you one-to-one, one-to-many, many-to-one, and many-to-many orthologs.

written 6 months ago by Mehmet
6 months ago by
University College London
Christophe Dessimoz wrote:

You can use OMA for this. If your genomes are in OMA, you can use the pairwise orthology function (

In your case, since Xenopus laevis is not yet in OMA, you could use OMA standalone, which produces pairwise ortholog files that also include 1:many and many:many relationships.
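Whichever tool produces the pairs, the relationship type can be recomputed from a plain list of (human gene, X. laevis gene) pairs. The Python sketch below assumes the pairs have already been read into tuples; it does not parse any specific OMA file format, and the identifiers H1, H2, X1.L, X1.S, and X2.L are placeholders.

```python
from collections import defaultdict


def classify_pairs(pairs):
    """Label each (human, frog) pair as 1:1, 1:many, many:1, or many:many.

    The label reads human:frog, so "1:many" means one human gene
    with several X. laevis orthologs.
    """
    human_to_frog = defaultdict(set)
    frog_to_human = defaultdict(set)
    for human, frog in pairs:
        human_to_frog[human].add(frog)
        frog_to_human[frog].add(human)

    labelled = []
    for human, frog in pairs:
        left = "1" if len(frog_to_human[frog]) == 1 else "many"
        right = "1" if len(human_to_frog[human]) == 1 else "many"
        labelled.append((human, frog, f"{left}:{right}"))
    return labelled


if __name__ == "__main__":
    example = [("H1", "X1.L"), ("H1", "X1.S"), ("H2", "X2.L")]
    for row in classify_pairs(example):
        print(*row, sep="\t")
```

For an allotetraploid, most retained duplicates should come out as 1:many, with both the L and S copies listed against the same human gene.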

modified 6 months ago by RamRS • written 6 months ago by Christophe Dessimoz