I'm looking for a way to search for the most conserved genes in two or more species. For example, if I input 'human' and 'pig', the output would be the top n most conserved genes between humans and pigs (by either nucleic acid or amino acid similarity).
Is there any tool that can do this? At the moment I'm reading UCSC's genome browser documentation to see if this could be done using it, but I'm not sure it could.
The biggest problem is that you have to have a clear definition of 'conserved genes' before proceeding.
First, you can align either protein or DNA sequences, so you already have two definitions of conservation: two genes have similar protein sequences, or they have similar DNA sequences. You can also define 'conservation' as having a high rate of synonymous changes compared to non-synonymous ones, and you can extend the definition to splicing, gene regulation, expression, etc.
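To make the sequence-similarity definition concrete, here is a minimal sketch of percent identity between two sequences that have already been aligned (DNA or protein). The function name `percent_identity` is just illustrative, and skipping gap columns is one convention among several, so treat the numbers as dependent on that choice.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are ignored.
    Other conventions (e.g. counting gaps as mismatches) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1

    # Avoid dividing by zero when no ungapped column exists.
    return 100.0 * matches / compared if compared else 0.0
```

The same function works for nucleotide and amino acid alignments, but the two will generally rank genes differently, which is why the definition has to be fixed first.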
One approach is to use a statistic called omega, which is the ratio dN/dS between the rate of non-synonymous changes (dN) and the rate of synonymous changes (dS) in the coding sequences of two orthologous genes. You can go to Ensembl/BioMart, get the omega value for all genes with their orthologues, then sort and take the genes with the lowest values: a low omega means few protein-changing substitutions, i.e. the most conserved genes.
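The sorting step could look roughly like the sketch below. It assumes you have exported a tab-separated table with one row per orthologue pair; the column names `gene_id`, `dn` and `ds` are hypothetical and will need to be changed to whatever headers your export actually contains. If the export already has an omega column, you can sort on that directly instead.

```python
import csv
import io


def top_conserved(tsv_text, n=10, min_ds=1e-6):
    """Return the n (gene_id, omega) pairs with the lowest dN/dS.

    Assumption: tsv_text has a header row with the hypothetical columns
    'gene_id', 'dn' and 'ds'. Rows with missing values or a near-zero dS
    are dropped, because omega is undefined or unstable for them.
    """
    ranked = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for rec in reader:
        try:
            gene = rec["gene_id"]
            dn = float(rec["dn"])
            ds = float(rec["ds"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric field
        if ds < min_ds:
            continue  # avoid division by (almost) zero
        ranked.append((gene, dn / ds))

    # Lowest omega first: fewest non-synonymous changes per synonymous change.
    ranked.sort(key=lambda pair: pair[1])
    return ranked[:n]
```

Call it with the contents of the exported file, e.g. `top_conserved(open("orthologues.tsv").read(), n=20)`, where the file name is a placeholder. Keep in mind that genes with very few substitutions overall can have noisy omega values, so it may be worth inspecting dN and dS alongside the ratio.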