How to make ortholog table with Reciprocal Best Hit blast (Python?) ?
1
1
Entering edit mode
5.0 years ago
jolespin ▴ 130

I have 2 fasta files containing all of the proteins from 2 distance organisms (A Spirochaetes and a Firmicutes). I want to map the genes from the Firmicutes to it's best hit in the Spirochaetes.

What is the best way to do this and the most accepted way?

I'm very familiar with Python and my first thought was to use skbio and do a pairwise alignment for all of the proteins.(http://scikitbio.org/docs/0.4.1/generated/skbio.alignment.StripedSmithWaterman.html). However, since it's local alignment then it may give me a high score for a single domain which is not what I want.

I then thought about using BioPython and the blast wrapper but I don't know how to specify the query database and a length threshold (http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc87).

blast genome protein gene ortholog • 4.2k views
3
Entering edit mode
5.0 years ago

It may be more straightforward to infer your orthologs using our software package OMA standalone (http://omabrowser.org), which takes your two fasta files as input and produces all pairwise relationships.

If you want to combine your two genomes with publicly available data, you can also export precomputed OMA genomes at http://omabrowser.org/export

0
Entering edit mode

I quickly checked onto OMA documentation, it looks promising. I'm wondering, have you compared your tool with the standard reciprocal blast approach? Why your tool is better? Besides, as i understand output of the tool is a table of pairwaise relations. Is there an option to directly extract all orthologs pairs from initial files (e.g. in FASTA format)?

1
Entering edit mode

Conceptually, the main limitation of reciprocal best hit is that it cannot cope with one-to-many or many-to-many orthology, which exist whenever a gene has duplicated after the speciation of interest. More discussion here:

Now, if you are interested in the relative performance of different orthology inference methods, including OMA and reciprocal blast hit, please refer to this paper:

http://www.nature.com/nmeth/journal/v13/n5/full/nmeth.3830.html