Question: How to deal with one-to-many orthologies in PAML
Solowars (Brazil/Porto Alegre/UFRGS) wrote, 2.9 years ago:

Hello everyone,

I want to perform a PAML analysis using codeml. For this, I have an alignment of protein-coding DNA sequences from a broad range of animals.

However, in some cases gene orthology among my species is not 1-to-1 (i.e., some animals have more than one ortholog and therefore more than one sequence). The problem is that PAML accepts only one sequence per species, which leaves me having to choose among these multiple sequences.

I looked through the PAML manual, and there is no guidance on this issue (which, I believe, must be fairly common). I built some trees and distance matrices, but in some cases the multiple copies are just "equally" distant from their respective orthologs.

Can you suggest any "best practice" to deal with this issue?

Thanks a lot!

Tags: paml, orthology, dn/ds

In general, people focus on one-to-one orthologs (and skip one-to-many or many-to-many ones) in their analyses precisely to avoid this problem.

written 2.9 years ago by Biojl

I have read several papers that filter and keep only genes with one-to-one orthology, but I didn't think this problem was so "insurmountable".

written 2.9 years ago by Solowars
lieven.sterck (VIB, Ghent, Belgium) wrote, 2.9 years ago:

Consider yourself lucky: in the plant field it's nearly all many-to-many relationships :(

From what I read, you are already on the right track. What people usually do is collect as much 'circumstantial evidence' as possible (the ensemble approach) to support the choice of one of the orthologs. That can indeed include phylogenetic tree information, distance metrics, genomic location information, simple BLAST hits, and so on. In the ideal case you gather enough of this evidence to boil it down to a single gene, but in reality you often will not!
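As a rough illustration of the distance-metric part of that evidence, here is a minimal Python sketch. It assumes you already have pairwise distances between all sequences and that species with a single copy can serve as a reference set. The function name `pick_representatives`, the input layout, and the `tol` threshold are all hypothetical choices for this example, not anything PAML provides. It picks, for each multi-copy species, the copy with the smallest mean distance to the reference sequences, and flags near-ties (the "equally far away" case) instead of hiding them.

```python
from statistics import mean


def pick_representatives(species_of, dist, tol=0.01):
    """Choose one sequence per species by mean distance to single-copy orthologs.

    species_of: dict mapping sequence id -> species name (hypothetical input).
    dist: dict mapping frozenset({id_a, id_b}) -> pairwise distance.
    tol: margin below which the two best copies are reported as ambiguous.
    Returns (chosen, ambiguous): chosen maps species -> sequence id,
    ambiguous lists the species where the choice was a near-tie.
    """
    by_species = {}
    for seq_id, species in species_of.items():
        by_species.setdefault(species, []).append(seq_id)

    # Species with exactly one copy form the reference set.
    reference = [ids[0] for ids in by_species.values() if len(ids) == 1]
    if not reference:
        raise ValueError("no single-copy species to use as reference")

    chosen, ambiguous = {}, []
    for species, ids in by_species.items():
        if len(ids) == 1:
            chosen[species] = ids[0]
            continue
        # Score each copy by its mean distance to the reference sequences.
        scores = {s: mean(dist[frozenset((s, r))] for r in reference) for s in ids}
        ranked = sorted(ids, key=lambda s: (scores[s], s))
        chosen[species] = ranked[0]
        # A small gap between the two best copies means distance alone cannot decide.
        if scores[ranked[1]] - scores[ranked[0]] < tol:
            ambiguous.append(species)
    return chosen, ambiguous


# Toy data: zebrafish has two copies, human and mouse have one each.
species_of = {"hs1": "human", "mm1": "mouse", "dr1": "zebrafish", "dr2": "zebrafish"}
dist = {
    frozenset(("dr1", "hs1")): 0.30, frozenset(("dr1", "mm1")): 0.32,
    frozenset(("dr2", "hs1")): 0.45, frozenset(("dr2", "mm1")): 0.47,
}
chosen, ambiguous = pick_representatives(species_of, dist)
print(chosen["zebrafish"], ambiguous)
```

Species that end up in `ambiguous` are the ones where the other lines of evidence (tree topology, genomic location, BLAST hits) would have to carry the decision.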

Here is a nice example of such an approach.

The question you asked is also frequently referred to as the "holy grail in bioinformatics", so you probably cannot expect a complete (or even any) answer.

If applicable, you can of course (as suggested) focus only on the 1-to-1 orthologs to make your life easier.
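For the 1-to-1 route, a minimal Python sketch of the filtering step could look like this. It assumes each orthologous group is available as a list of (species, sequence id) pairs; that layout, the group names, and the helper `is_one_to_one` are made up for illustration and do not correspond to the output format of any particular orthology tool.

```python
from collections import Counter


def is_one_to_one(group, required_species):
    """True if the group has exactly one sequence for every required species.

    group: list of (species, sequence_id) pairs (hypothetical input format).
    """
    counts = Counter(species for species, _ in group)
    return set(counts) == set(required_species) and all(n == 1 for n in counts.values())


# Toy data: OG2 has two zebrafish copies, so it is dropped.
orthogroups = {
    "OG1": [("human", "hs1"), ("mouse", "mm1"), ("zebrafish", "dr1")],
    "OG2": [("human", "hs2"), ("mouse", "mm2"), ("zebrafish", "dr2"), ("zebrafish", "dr3")],
}
species = ["human", "mouse", "zebrafish"]
single_copy = {name: g for name, g in orthogroups.items() if is_one_to_one(g, species)}
print(sorted(single_copy))
```

Only the groups that survive this filter would then be aligned and passed on to codeml.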


Thank you so much for your insightful answer. I've been thinking about this issue for some time, and it's good to know that I'm by no means alone. Again, thanks!

written 2.9 years ago by Solowars