Orthology
3 months ago
PhyloW • 0

I am doing a phylogenetic analysis of two sets of proteins (A and B) that are functionally very closely related and share a large degree of sequence similarity. From various species, I have identified the protein sequences I want to include in the analysis (for both proteins). Each sequence was included based on its similarity (BLAST results) to the known, functionally characterized proteins A and B in Arabidopsis. However, I am worried that some of the included sequences might be paralogs rather than orthologs. Is there an analysis where I can "plug and play" the data I have and see whether the sequences come out as orthologs (hypothetically, one orthologous group for protein A and one for protein B)? I do not want to search a database for orthologs; I want to identify them among the sequences already in my dataset (which were included based on pre-selected criteria).

Paralogy Orthology BLAST • 354 views

You can use OrthoMCL for this purpose.


Thank you for the answer. I have read up along similar lines, but was not quite sure whether it was the best approach. Will give it a try though.


I think it would definitely help you.

Just to give a very brief introduction to how it works: it takes a set of protein sequences (say, the proteomes of three different species) and performs homology-based clustering, first running BLAST (for sequence similarity) and then clustering the results with the MCL program. Finally, it predicts which proteins are paralogous and which are orthologous, and groups them accordingly.
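To make the clustering step more concrete, here is a minimal Python sketch of Markov clustering (the idea behind the MCL program) on a toy similarity graph. It is not OrthoMCL itself: the real pipeline also normalizes BLAST scores and looks at reciprocal best hits before clustering. The sequence names, the similarity scores, and the function name `mcl_clusters` are made up for illustration, and reading clusters off as connected components of the final matrix is a simplification.

```python
def mcl_clusters(names, scores, inflation=2.0, iterations=30, tol=1e-6):
    """Toy Markov clustering on pairwise similarity scores (hypothetical data)."""
    n = len(names)
    index = {name: i for i, name in enumerate(names)}

    # Build a symmetric similarity matrix with self-loops on the diagonal.
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for (a, b), score in scores.items():
        m[index[a]][index[b]] = m[index[b]][index[a]] = score

    def normalize(mat):
        # Rescale every column so that it sums to 1.
        sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        return [[mat[i][j] / sums[j] for j in range(n)] for i in range(n)]

    m = normalize(m)
    for _ in range(iterations):
        # Expansion: square the matrix (flow spreads along longer paths).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power and renormalize, which
        # strengthens strong connections and weakens weak ones.
        m = normalize([[v ** inflation for v in row] for row in m])

    # Simplified read-out: connected components of the remaining flow.
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(names[i])
            for j in range(n):
                if j not in seen and (m[i][j] > tol or m[j][i] > tol):
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(group))
    return sorted(clusters)


# Hypothetical sequences and similarity scores (not real BLAST output).
names = ["A_ath", "A_osa", "A_zma", "B_ath", "B_osa"]
scores = {
    ("A_ath", "A_osa"): 0.9,
    ("A_ath", "A_zma"): 0.9,
    ("A_osa", "A_zma"): 0.9,
    ("B_ath", "B_osa"): 0.8,
}
clusters = mcl_clusters(names, scores)
print(clusters)
```

In this toy case there are no hits between the A and B sequences, so the two groups separate trivially. When A and B are very similar to each other, as in the question, whether they end up in one cluster or two depends on the scores and on the inflation value.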


Note that this is a shortcut which I would be wary of using in this case. Strictly speaking, by their very definition, paralogy and orthology can only be inferred from a phylogenetic tree. I would add the sequences to be tested to the relevant multiple sequence alignment, rebuild a phylogenetic tree from it, and then infer the relationships.
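As a rough illustration of the last step, one common heuristic for reading orthology off a rooted gene tree is species overlap: an internal node whose two child clades share a species is treated as a duplication, otherwise as a speciation, and two genes are called orthologs if their last common ancestor is a speciation node. The Python sketch below assumes a rooted binary tree written as nested tuples and leaf names of the form `Species_gene`. The tree, the names, and the function `classify_pairs` are hypothetical, and the result is only as good as the tree and its rooting.

```python
from itertools import product


def classify_pairs(tree):
    """Label every pair of leaves as 'orthologs' or 'paralogs' (species overlap).

    `tree` is a rooted binary gene tree as nested 2-tuples; leaves are strings
    of the form 'Species_gene' (a naming convention assumed for this sketch).
    """
    relations = {}

    def visit(node):
        if isinstance(node, str):
            return [node]
        left, right = visit(node[0]), visit(node[1])
        species_left = {leaf.split("_")[0] for leaf in left}
        species_right = {leaf.split("_")[0] for leaf in right}
        # Shared species on both sides means this node is a duplication.
        kind = "paralogs" if species_left & species_right else "orthologs"
        # This node is the last common ancestor of every left/right pair.
        for a, b in product(left, right):
            relations[frozenset((a, b))] = kind
        return left + right

    visit(tree)
    return relations


# Hypothetical rooted gene tree: a duplication at the root gave rise to
# the A and B families, followed by speciation within each family.
gene_tree = (
    (("Ath_A", "Osa_A"), "Ppa_A"),
    (("Ath_B", "Osa_B"), "Ppa_B"),
)
relations = classify_pairs(gene_tree)
for pair, kind in sorted((sorted(p), k) for p, k in relations.items()):
    print(pair, kind)
```

In practice the tree would come from your alignment and a tree-building program, and poorly supported branches should make you cautious about any calls derived from them.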


Perhaps I should also mention that not all the species/sequences we include are necessarily from fully sequenced and annotated genomes. Is OrthoMCL not too "specialised" in that regard, given that it relies on these assumptions?
