Question: How to make an ortholog table with reciprocal best hit BLAST (in Python?)
2.8 years ago by
jolespin120
United States
jolespin120 wrote:

I have two FASTA files containing all of the proteins from two distantly related organisms (a Spirochaetes and a Firmicutes). I want to map each gene from the Firmicutes to its best hit in the Spirochaetes.

What is the best and most widely accepted way to do this?

I'm very familiar with Python, and my first thought was to use skbio to do a pairwise alignment for all of the proteins (http://scikitbio.org/docs/0.4.1/generated/skbio.alignment.StripedSmithWaterman.html). However, since that is a local alignment, it may give a high score for a single shared domain, which is not what I want.

I then thought about using Biopython and its BLAST wrapper, but I don't know how to specify the database to query against or a length threshold (http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc87).
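One possible way to get both a chosen database and a length threshold is to run BLAST+ `blastp` against a database built from the other proteome, ask for tabular output that includes the sequence lengths (for example `-outfmt "6 qseqid sseqid pident length qlen slen evalue bitscore"`), and apply the threshold while parsing. The following is only a minimal sketch under those assumptions: the column order, the `best_hits` helper, the 0.7 coverage cutoff, and the toy rows are all made up for illustration.

```python
import csv
from io import StringIO

# Assumed column order, matching a BLAST+ run with
# -outfmt "6 qseqid sseqid pident length qlen slen evalue bitscore"
COLUMNS = ["qseqid", "sseqid", "pident", "length",
           "qlen", "slen", "evalue", "bitscore"]


def best_hits(handle, min_coverage=0.7):
    """Return {query: (subject, bitscore)} with the top-scoring hit per query.

    A hit is skipped when the alignment spans less than min_coverage of
    either the query or the subject, so a single shared domain is ignored.
    The alignment length includes gaps, so this coverage is approximate.
    """
    best = {}
    for row in csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t"):
        aln_len = int(row["length"])
        coverage = min(aln_len / int(row["qlen"]), aln_len / int(row["slen"]))
        if coverage < min_coverage:
            continue
        score = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (row["sseqid"], score)
    return best


# Toy rows (hypothetical identifiers). The second row has the highest
# bitscore but covers only a short stretch of both proteins.
example = StringIO(
    "firm_1\tspiro_9\t45.0\t290\t300\t310\t1e-80\t250\n"
    "firm_1\tspiro_4\t60.0\t80\t300\t900\t1e-30\t400\n"
    "firm_2\tspiro_2\t38.0\t195\t200\t210\t1e-40\t150\n"
)
hits = best_hits(example)
print(hits)
```

With a real result file, the `StringIO` object would be replaced by an open file handle. Running the same function on the search in the opposite direction gives the second table needed for a reciprocal comparison.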

blast protein ortholog gene genome • 2.3k views
2.8 years ago by
University College London
Christophe Dessimoz540 wrote:

It may be more straightforward to infer your orthologs using our software package OMA standalone (http://omabrowser.org), which takes your two fasta files as input and produces all pairwise relationships.

If you want to combine your two genomes with publicly available data, you can also export precomputed OMA genomes at http://omabrowser.org/export


I had a quick look at the OMA documentation and it looks promising. Have you compared your tool with the standard reciprocal BLAST approach? Why is your tool better? Also, as I understand it, the output of the tool is a table of pairwise relations. Is there an option to directly extract all ortholog pairs from the initial files (e.g. in FASTA format)?

written 2.1 years ago by Denis100

Conceptually, the main limitation of reciprocal best hit is that it cannot cope with one-to-many or many-to-many orthology, which arises whenever a gene has duplicated after the speciation of interest. More discussion here:

https://academic.oup.com/gbe/article/5/10/1800/520875/Bidirectional-Best-Hits-Miss-Many-Orthologs-in
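To make the limitation concrete, here is a small toy sketch of the reciprocal best hit rule. The gene names and the `reciprocal_best_hits` helper are hypothetical, and the two dictionaries stand in for best-hit tables obtained from searches in each direction. Gene `A_x` has two co-orthologs in genome B (`B_x1` and `B_x2`, from a duplication after speciation), but only one of them can be its best hit.

```python
def reciprocal_best_hits(a_to_b, b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


# Hypothetical best-hit tables for genomes A and B.
# B_x1 and B_x2 both point back to A_x, but A_x can only point to one of them.
a_to_b = {"A_x": "B_x1", "A_y": "B_y"}
b_to_a = {"B_x1": "A_x", "B_x2": "A_x", "B_y": "A_y"}

pairs = reciprocal_best_hits(a_to_b, b_to_a)
paired_b = {b for _, b in pairs}
missed = sorted(b for b in b_to_a if b not in paired_b)

print(pairs)
print(missed)
```

In this toy case the rule reports `A_x` with `B_x1` and `A_y` with `B_y`, while `B_x2` is left without a partner even though it is just as much an ortholog of `A_x`.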

Now, if you are interested in the relative performance of different orthology inference methods, including OMA and reciprocal best hit, please refer to this paper:

http://www.nature.com/nmeth/journal/v13/n5/full/nmeth.3830.html

written 24 months ago by Christophe Dessimoz540