Multiple OMA groups per uniprotID in OMAdb
2.0 years ago
chow ▴ 10

I am interested in using the OMA database to search for orthologs of my proteins, but I notice that a single UniProt protein ID can map to multiple OMAdb IDs and therefore to different orthologous groups. I am wondering how I should proceed with this information, as I expected a 1:1 correspondence between UniProt ID and OMA group.

For example, genes AERPE00688 and AERPE00689 both correspond to UniProt accession number P58026 (entry name RL34-AERPE), but the former belongs to OMA group 1003628 whilst the latter belongs to 1003636. Could you provide some insight into what this signifies and how I should proceed if I wanted to choose one OMA group to work with? I think I am probably missing a key concept here, apologies for that!

OMA orthologs • 509 views
2.0 years ago

Hi chow,

While it is generally true that there is a 1:1 mapping between UniProt IDs and the protein IDs in OMA, this is not a hard constraint. The case you mention with the Aeropyrum pernix proteins arises because we use a slightly older genome version than the one now available. At that time, the annotation contained two almost identical protein sequences, one mapping to P58026 and the other to Q05E33. UniProtKB has since merged the latter entry into the former.

Our mapping to UniProt is based on sequence identity, allowing for a few missing amino acids, which is why the mapping is not always guaranteed to be 1:1. It is still rather stringent: we more often have missing UniProt accessions because of differences in the underlying gene model than the opposite. So there is always a trade-off.
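To illustrate why an identity-based mapping with a small tolerance can yield more than one OMA entry per accession, here is a minimal Python sketch. It is not the actual OMA mapping procedure: the rule (the shorter sequence must occur verbatim inside the longer one, with at most `max_missing` residues unaccounted for), the function names, and all IDs and sequences are assumptions invented for the example.

```python
from collections import defaultdict


def matches(oma_seq, uniprot_seq, max_missing=3):
    """Return True if the shorter sequence occurs verbatim in the longer one
    and the two differ in length by at most `max_missing` residues.
    Assumed rule for illustration only, not OMA's real criterion."""
    shorter, longer = sorted((oma_seq, uniprot_seq), key=len)
    return len(longer) - len(shorter) <= max_missing and shorter in longer


def map_accessions(oma_entries, uniprot_entries, max_missing=3):
    """Map each UniProt accession to every OMA entry whose sequence matches."""
    mapping = defaultdict(list)
    for oma_id, oma_seq in oma_entries.items():
        for accession, uniprot_seq in uniprot_entries.items():
            if matches(oma_seq, uniprot_seq, max_missing):
                mapping[accession].append(oma_id)
    return dict(mapping)


# Toy data: IDs and sequences are invented, not real OMA or UniProt records.
oma_entries = {
    "TOY00001": "MKRTYQPSKLRRAR",
    "TOY00002": "KRTYQPSKLRRAR",  # same protein, first residue missing
    "TOY00003": "MSDNEELLKQ",     # too short to match ACC_B within tolerance
}
uniprot_entries = {
    "ACC_A": "MKRTYQPSKLRRAR",
    "ACC_B": "MSDNEELLKQAAAAAAA",
}

result = map_accessions(oma_entries, uniprot_entries)
print(result)
```

In this toy case two OMA entries end up on `ACC_A` (a non-1:1 mapping), while `ACC_B` gets no OMA entry at all because the gene models differ too much, which mirrors the trade-off described above.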

If you are searching for orthologs, you should probably consider our HOGs (hierarchical orthologous groups) rather than the OMA Groups. The OMA Groups are groups in which every sequence is orthologous to every other sequence in the group, which by definition disallows two proteins from the same species in one group. The HOGs, however, contain all proteins that descend from a single ancestral gene at a specific taxonomic range, including in-paralogs that arose through later duplication events. Sequences as similar as the two in your example are usually placed together in the same HOG.
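As a rough sketch of what this means in practice, the Python snippet below groups the two entries from the question first by OMA Group and then by HOG. The OMA Group numbers are the ones quoted in the question; the HOG identifier is a placeholder, and the assumption that both entries share one HOG is only for illustration and should be checked in the OMA Browser. The record layout and helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records. "hog" is a placeholder value, NOT a real HOG ID,
# and the shared HOG membership is an assumption for this example.
records = [
    {"oma_id": "AERPE00688", "oma_group": 1003628, "hog": "HOG:PLACEHOLDER"},
    {"oma_id": "AERPE00689", "oma_group": 1003636, "hog": "HOG:PLACEHOLDER"},
]


def group_by(records, key):
    """Collect OMA IDs under each distinct value of `key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["oma_id"])
    return dict(groups)


def one_protein_per_species(oma_ids):
    """Check the OMA Group constraint: no two members from the same species.
    The species code is taken as the first five characters of the OMA ID."""
    species = [oma_id[:5] for oma_id in oma_ids]
    return len(species) == len(set(species))


by_oma_group = group_by(records, "oma_group")
by_hog = group_by(records, "hog")

print(by_oma_group)  # each entry sits alone in its own OMA Group
print(by_hog)        # both entries fall under the same (placeholder) HOG
```

Because both proteins come from the same species, they could never share an OMA Group, so working at the HOG level avoids having to pick one of the two groups arbitrarily.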

I hope this helps, otherwise feel free to get back to us.

Best wishes, Adrian
