Mining Ortholog Cliques

10.1 years ago

pld 5.1k

Does anyone have experience mining cliques from ortholog data? I have orthologs generated by reciprocal best BLAST hits across 22 species: 21 Ensembl species plus one species for which we are annotating a de novo transcriptome.

My goal is to measure the ratio of the average pairwise distance among my 21 known species to the average distance from my de novo transcript to the known species. The idea is to get a sense of how conserved that gene is in general, and whether my species is closer or further away than one would expect. I think having cliques is the best way to do this.
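To make the ratio concrete, here is a minimal sketch for a single clique. It assumes pairwise distances are already available in a dict keyed by `frozenset` pairs of gene IDs; the function name `conservation_ratio` and the gene IDs are hypothetical, and the choice of distance measure is left open.

```python
from itertools import combinations
from statistics import mean

def conservation_ratio(dist, known, novel):
    """Mean known-vs-known distance divided by mean novel-vs-known distance.

    dist  : dict mapping frozenset({gene_a, gene_b}) -> distance (assumed input)
    known : gene IDs from the known species in one clique
    novel : gene ID of the de novo transcript in the same clique
    """
    # Average over every unordered pair of known-species genes.
    within = mean(dist[frozenset(pair)] for pair in combinations(known, 2))
    # Average distance from the de novo transcript to each known gene.
    to_novel = mean(dist[frozenset((novel, k))] for k in known)
    return within / to_novel
```

A ratio below 1 would mean the de novo transcript sits further from the known genes than they sit from each other, on average.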

I'm not against writing a script, but clique finding is NP-complete in general and I have just shy of 1 million transcripts. I'd have access to maybe 100 to 200 nodes with 24 cores each, but this still seems like a time to move away from Python. Plus, clique finding is non-trivial to parallelize. I'm assuming there is software for this, but I'm worried that software written for generalized cliques might be a larger pain to adapt to my needs. Does anyone have experience with mining cliques?
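For reference, a minimal sketch of what a script could look like, using Bron-Kerbosch with pivoting to enumerate maximal cliques. It assumes the input is a list of reciprocal best hit pairs of gene IDs; the function names are hypothetical. It also assumes strict reciprocal best hits, so each gene has at most one partner per other species and therefore at most 21 neighbors, which would keep each search small despite the worst-case complexity.

```python
from collections import defaultdict

def build_graph(pairs):
    """Build an undirected adjacency map from reciprocal best hit pairs."""
    adj = defaultdict(set)
    for a, b in pairs:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)

def maximal_cliques(adj):
    """Yield every maximal clique as a frozenset of gene IDs."""
    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    # Outer loop in a fixed vertex order: each search starts from one gene
    # and only ever touches that gene's neighbors.
    order = {v: i for i, v in enumerate(adj)}
    for v in adj:
        later = {u for u in adj[v] if order[u] > order[v]}
        earlier = adj[v] - later
        yield from expand({v}, later, earlier)
```

Cliques with one gene from each of the 22 species could then be selected by size. Because each outer-loop search only reads the neighborhood of its starting gene, the starting genes could in principle be split across workers.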

blast genes gene • 1.6k views
