Hi all! I am hoping to get a file containing 1:1:1:1:1:1:1 orthologs across human, mouse, chicken, rat, rabbit, possum, and macaque. I used these BiomaRt options.
But I have a problem.
In the CSV list that BiomaRt gave me, I have these annoying many-to-many relationships. I want each gene to occur only once throughout the species columns. Some genes, however, are highly repeated. I used Python to filter this list and found that only 1781 out of 1901282 lines are unique. Can someone give me advice on how to get orthologs across these species in a way that avoids these many-to-many relationships? I don't believe that there are only 1781 genes that are orthologous to each other in these 7 species. Please help, what do I do? I am a lowly grad student with a phenomenal task. #wannacry
Have you looked at NCBI's HomoloGene database? It gives more or less one-to-one orthologs across many species. Here is an example for the GAPDH gene.
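If you would rather keep working from the CSV you already have, here is a minimal Python sketch of one possible filter. It assumes each row holds one gene ID per species column, drops exact duplicate rows, and then keeps only the rows in which every gene ID is present and appears in no other row of its column. The function name `keep_one_to_one` and the column names are made up for illustration, and this is a rough heuristic rather than a proper orthology call.

```python
from collections import Counter


def keep_one_to_one(rows, species_columns):
    """Keep rows where every species has a gene ID and each ID is unique in its column.

    rows: list of dicts, e.g. as produced by csv.DictReader.
    species_columns: the column names to check (hypothetical names, adjust to your file).
    """
    # Collapse exact duplicate rows first, preserving the original order.
    unique_rows = list(
        dict.fromkeys(tuple(row[col] for col in species_columns) for row in rows)
    )

    # For each species column, count how many distinct rows mention each gene ID.
    counts = [
        Counter(row[i] for row in unique_rows)
        for i in range(len(species_columns))
    ]

    kept = []
    for row in unique_rows:
        # A row survives only if no ID is missing and no ID recurs in its column.
        if all(gene and counts[i][gene] == 1 for i, gene in enumerate(row)):
            kept.append(dict(zip(species_columns, row)))
    return kept


# Toy data standing in for the real export (IDs are invented).
example_rows = [
    {"human": "H1", "mouse": "M1", "rat": "R1"},
    {"human": "H1", "mouse": "M1", "rat": "R1"},  # exact duplicate row
    {"human": "H2", "mouse": "M2", "rat": "R2"},
    {"human": "H2", "mouse": "M3", "rat": "R2"},  # H2 maps to two mouse genes
    {"human": "H3", "mouse": "", "rat": "R3"},    # no mouse ortholog
]
one_to_one = keep_one_to_one(example_rows, ["human", "mouse", "rat"])
print(one_to_one)
```

On the real file you could read the rows with `csv.DictReader` and pass in your seven species column names. Note that a gene involved in any one-to-many relationship is dropped entirely, which is stricter than picking one representative per gene.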