Question: How to make ortholog table with Reciprocal Best Hit blast (Python?) ?
1
gravatar for jolespin
3.8 years ago by
jolespin130
United States
jolespin130 wrote:

I have 2 fasta files containing all of the proteins from 2 distance organisms (A Spirochaetes and a Firmicutes). I want to map the genes from the Firmicutes to it's best hit in the Spirochaetes.

What is the best way to do this and the most accepted way?

I'm very familiar with Python and my first thought was to use skbio and do a pairwise alignment for all of the proteins.(http://scikitbio.org/docs/0.4.1/generated/skbio.alignment.StripedSmithWaterman.html). However, since it's local alignment then it may give me a high score for a single domain which is not what I want.

I then thought about using BioPython and the blast wrapper but I don't know how to specify the query database and a length threshold (http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc87).

blast protein ortholog gene genome • 3.2k views
ADD COMMENTlink modified 5 months ago by Biostar ♦♦ 20 • written 3.8 years ago by jolespin130
2
gravatar for Christophe Dessimoz
3.8 years ago by
University College London
Christophe Dessimoz660 wrote:

It may be more straightforward to infer your orthologs using our software package OMA standalone (http://omabrowser.org), which takes your two fasta files as input and produces all pairwise relationships.

If you want to combine your two genomes with publicly available data, you can also export precomputed OMA genomes at http://omabrowser.org/export

ADD COMMENTlink modified 3.0 years ago • written 3.8 years ago by Christophe Dessimoz660

I quickly checked onto OMA documentation, it looks promising. I'm wondering, have you compared your tool with the standard reciprocal blast approach? Why your tool is better? Besides, as i understand output of the tool is a table of pairwaise relations. Is there an option to directly extract all orthologs pairs from initial files (e.g. in FASTA format)?

ADD REPLYlink written 3.1 years ago by Denis190
1

Conceptually, the main limitation of reciprocal best hit is that it cannot cope with one-to-many or many-to-many orthology, which exist whenever a gene has duplicated after the speciation of interest. More discussion here:

https://academic.oup.com/gbe/article/5/10/1800/520875/Bidirectional-Best-Hits-Miss-Many-Orthologs-in

Now, if you are interested in the relative performance of different orthology inference methods, including OMA and reciprocal blast hit, please refer to this paper:

http://www.nature.com/nmeth/journal/v13/n5/full/nmeth.3830.html

ADD REPLYlink written 3.0 years ago by Christophe Dessimoz660
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1894 users visited in the last hour