The OMA browser (http://omabrowser.org/oma/home/) is very useful for my ortholog analysis work. However, I have run into a problem with ID mapping. I searched the identifier mapping files but could not find what I need. Specifically, how can I convert a large list of Oryza sativa UniProt IDs (for example, Q10M50) to RGAP IDs (for example, LOC_Os03g20700)? Thanks!
The OMA browser has no direct mapping between UniProtKB/TrEMBL accessions and RGAP IDs. However, you can download two mapping files from the download section of the current release: the plant mapping and the mapping to UniProt. Both files are keyed on the OMA ID, so joining them on that shared ID gives you a direct mapping from UniProt to RGAP. In Python you could do something like this to produce a mapping file from UniProt IDs to the plant IDs:
    import csv
    import collections

    # Assumes both files are tab-separated with the OMA ID in column 0 and
    # the external ID in column 1; lines starting with '#' are comments.
    with open('oma-uniprot.txt', 'r') as up, \
            open('oma-plants.txt', 'r') as plant, \
            open('up-plants.txt', 'w', newline='') as out:
        up_reader = csv.reader((row for row in up if not row.startswith('#')), delimiter='\t')
        plant_reader = csv.reader((row for row in plant if not row.startswith('#')), delimiter='\t')
        out_writer = csv.writer(out, delimiter='\t')

        # OMA ID -> list of UniProt IDs
        oma2up = collections.defaultdict(list)
        for row in up_reader:
            oma2up[row[0]].append(row[1])

        # For each plant ID, write one line per UniProt ID sharing its OMA ID
        for row in plant_reader:
            if row[0] in oma2up:
                for up_id in oma2up[row[0]]:
                    out_writer.writerow([up_id, row[1]])
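Once up-plants.txt exists, converting a large list of UniProt IDs is a dictionary lookup. The following is a minimal sketch, assuming the two-column, tab-separated layout written by the script above; the function names `load_mapping` and `convert_ids` are hypothetical, and the example rows are made up for illustration (the Q10M50 / LOC_Os03g20700 pairing is taken from the example IDs in the question, not checked against OMA).

```python
import collections


def load_mapping(lines):
    """Build a dict: UniProt ID -> list of plant IDs.

    `lines` is any iterable of tab-separated text lines (for example an
    open file handle on up-plants.txt). Comment lines starting with '#'
    and malformed lines are skipped; duplicate pairs are collapsed.
    """
    mapping = collections.defaultdict(list)
    for line in lines:
        line = line.rstrip('\n')
        if not line or line.startswith('#'):
            continue
        fields = line.split('\t')
        if len(fields) < 2:
            continue
        up_id, plant_id = fields[0], fields[1]
        if plant_id not in mapping[up_id]:
            mapping[up_id].append(plant_id)
    return mapping


def convert_ids(mapping, query_ids):
    """Return (uniprot_id, [plant_ids]) per query, with [] if unmapped."""
    return [(q, list(mapping.get(q, []))) for q in query_ids]


# Illustrative data only; in practice pass open('up-plants.txt') instead.
example_lines = [
    "Q10M50\tLOC_Os03g20700\n",
    "Q10M50\tLOC_Os03g20700\n",  # duplicate pair, collapsed on load
]
mapping = load_mapping(example_lines)
for up_id, plant_ids in convert_ids(mapping, ["Q10M50", "UNKNOWN_ID"]):
    print(up_id, ",".join(plant_ids) or "-", sep="\t")
```

Keeping unmapped IDs in the output with an empty result makes it easy to see which accessions have no entry in the current OMA release, rather than having them disappear silently.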