OMA gene vs splicing variants in the output
2.7 years ago

Hi there,

I've been using OMA standalone to identify orthologs and their evolution in my data set. A couple of species in my data set include splice variants, so, as described in the manual, I included .splice files. The analyses went smoothly, used_splicing_variants.txt appeared in the output folder, and it looks fine.

But when I analyzed HierarchicalGroups.orthoxml (or other .orthoxml files), I realized that each splice variant is encoded as a separate gene.

I used the .orthoxml file (with pyHam) to infer the number of gene families with duplications and the number of gained/lost genes, but now I'm not sure how to interpret these results. pyHam seems to interpret each splice variant as a separate gene, so the number of "gene gains" sums up to the number of transcripts. Is there a way to use only one splice variant per gene in pyHam analyses with OMA output?

Thanks in advance!

OMA orthologs pyham
2.7 years ago


Currently, pyHam does not have an option to skip selected gene elements from an orthoxml file. We agree that this is an important feature and plan to implement it in the future.

As a short-term fix, the easiest option in my view is to remove from the orthoxml file the <gene/> elements that do not correspond to a used splicing variant. None of these 'genes' is part of any HOG, so nothing else in the file refers to them, and removing the gene elements is sufficient.
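The removal described above could be scripted roughly as follows. This is only a minimal sketch, not part of OMA or pyHam: the function names `unused_variants` and `strip_genes` are made up for illustration, and it assumes that each line of a .splice file lists the variants of one gene separated by semicolons, and that the `protId` attribute of each <gene/> element matches those identifiers. You would need to collect the set of used identifiers yourself from used_splicing_variants.txt. As a safety net, a gene that is still referenced by a <geneRef/> is never removed.

```python
import xml.etree.ElementTree as ET


def unused_variants(splice_lines, used_ids):
    """Return IDs that appear in a splice group but are not in used_ids.

    Assumption: one gene per line, variants separated by ';'.
    """
    drop = set()
    for line in splice_lines:
        variants = [v.strip() for v in line.split(";") if v.strip()]
        drop.update(v for v in variants if v not in used_ids)
    return drop


def strip_genes(orthoxml_in, orthoxml_out, drop_ids):
    """Write a copy of the orthoxml without the <gene/> elements in drop_ids.

    Assumption: drop_ids are compared against the 'protId' attribute.
    Returns the number of removed gene elements.
    """
    tree = ET.parse(orthoxml_in)
    root = tree.getroot()

    # Reuse whatever namespace the file declares, e.g. "{http://orthoXML.org/2011/}".
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    if ns:
        # Keep the namespace as the default one in the output file.
        ET.register_namespace("", ns[1:-1])

    # Genes referenced from any group must stay, whatever drop_ids says.
    referenced = {ref.get("id") for ref in root.iter(f"{ns}geneRef")}

    removed = 0
    for genes in root.iter(f"{ns}genes"):
        for gene in list(genes):  # copy, because we modify while iterating
            if gene.get("protId") in drop_ids and gene.get("id") not in referenced:
                genes.remove(gene)
                removed += 1

    tree.write(orthoxml_out, encoding="utf-8", xml_declaration=True)
    return removed
```

A call would look like `strip_genes("HierarchicalGroups.orthoxml", "HierarchicalGroups.filtered.orthoxml", drop_ids)`, where the output file name is arbitrary. It is worth checking that the returned count equals the number of unused variants you expect before passing the filtered file to pyHam.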

If you do not care about the gene gains/losses on the terminal branches leading to the individual species, you can actually use the file directly. The minor variants will all appear as gene gains on the terminal branches, so counts on internal branches are unaffected.

Best wishes, Adrian


Hi Adrian, thank you for your response, it helps a lot.

