Question

Gene Similarity Search

1

14.4 years ago

bow ▴ 790

I'm looking for a way to search for the most conserved genes in two or more species. For example, if I input 'human' and 'pig', the output would be the top n most conserved genes between humans and pigs (either by nucleic acid or amino acid similarity).

Is there any tool that can do this? At the moment I'm reading UCSC's genome browser documentation to see if this could be done using it, but I'm not sure it could.

conservation similarity • 3.2k views

Updated 9 months ago by Ram 44k • written 14.4 years ago by bow ▴ 790

Ram · Answer 1 · 2010-04-16

2

14.4 years ago

Giovanni M Dall'Olio 28k

The biggest problem is that you need a clear definition of 'conserved genes' before proceeding. First, you can align either protein or DNA sequences, so you already have two definitions of conservation: two genes can have a similar protein sequence or a similar DNA sequence. You can also define 'conservation' as having a high rate of synonymous changes relative to nonsynonymous ones, and you could go further and include splicing, gene regulation, expression, etc.

One approach is to use a statistic called omega, the ratio dN/dS, where dN is the rate of nonsynonymous changes and dS is the rate of synonymous changes between the coding sequences of two orthologous genes. You can go to Ensembl/BioMart, get the omega value for all genes with their orthologues, then sort and take the genes with the lowest values, which are the most conserved under this definition.
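To make the sorting step concrete, here is a minimal sketch, assuming the dN and dS values have already been exported to a tab-separated table. The column names (`gene_id`, `dn`, `ds`) and the rows are made up for illustration and are not real BioMart headers or data. Pairs with dS of zero are skipped because omega is undefined for them.

```python
import csv
import io

# Hypothetical export: one row per gene/orthologue pair.
# Column names and values are illustrative only, not real BioMart output.
EXAMPLE_TSV = (
    "gene_id\tdn\tds\n"
    "geneA\t0.010\t0.500\n"
    "geneB\t0.200\t0.400\n"
    "geneC\t0.000\t0.300\n"
    "geneD\t0.050\t0.000\n"
)


def rank_by_omega(handle, top_n=10):
    """Return the top_n (gene_id, omega) pairs with the lowest omega = dN/dS."""
    ranked = []
    for row in csv.DictReader(handle, delimiter="\t"):
        dn = float(row["dn"])
        ds = float(row["ds"])
        if ds <= 0:
            # omega is undefined when there are no synonymous changes
            continue
        ranked.append((row["gene_id"], dn / ds))
    # Lower omega means stronger constraint on the protein sequence
    ranked.sort(key=lambda pair: pair[1])
    return ranked[:top_n]


ranked = rank_by_omega(io.StringIO(EXAMPLE_TSV), top_n=3)
for gene_id, omega in ranked:
    print(f"{gene_id}\t{omega:.3f}")
```

A very low omega can also come from a tiny dN over a poorly estimated dS, so it may be worth filtering on a minimum dS before trusting the ranking.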

Updated 9 months ago by Ram 44k • written 14.4 years ago by Giovanni M Dall'Olio 28k

0

I guess for starters, I'll just be looking for the most similar protein sequences. So for example, I would want to know among all the orthologs between humans and zebrafish, which one has the most similar protein sequences.

I've never used biomart before, but thanks for giving me another tool to play with :).

Reply written 14.4 years ago by bow ▴ 790

0

Note that you can also get the % identity from BioMart. Go to BioMart, select "Ensembl Genes->Homo sapiens Genes" as the dataset, then under 'Attributes' click on "homologues" and pick a species. You can select the values from there. Note that you can't get the omega value directly; you have to calculate dN/dS yourself.
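Once the % identity values are exported, picking the most similar orthologues is just a sort. A minimal sketch, assuming a tab-separated export with hypothetical column names (`gene_id`, `ortholog_id`, `perc_id`) and made-up rows; the real BioMart headers will differ. Genes with no orthologue in the target species are assumed to come out with empty fields and are skipped.

```python
import csv
import io

# Hypothetical export of homologue attributes; headers and values are
# illustrative only and do not match real BioMart column names.
EXAMPLE_TSV = (
    "gene_id\tortholog_id\tperc_id\n"
    "geneA\torthA\t98.5\n"
    "geneB\t\t\n"
    "geneC\torthC\t71.0\n"
    "geneD\torthD\t99.2\n"
)


def top_by_identity(handle, top_n=10):
    """Return the top_n (gene, orthologue, % identity) rows, highest first."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if not row["perc_id"]:
            # no orthologue reported for this gene
            continue
        hits.append((row["gene_id"], row["ortholog_id"], float(row["perc_id"])))
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits[:top_n]


top = top_by_identity(io.StringIO(EXAMPLE_TSV), top_n=2)
for gene_id, ortholog_id, perc_id in top:
    print(f"{gene_id}\t{ortholog_id}\t{perc_id:.1f}")
```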

Reply written 14.4 years ago by Giovanni M Dall'Olio 28k

0

Thanks! I tried using filters -> 'orthologous x genes' only and it works, too.

Reply written 14.4 years ago by bow ▴ 790

Ram · Answer 2 · 2010-04-16

1

14.4 years ago

Pierre Lindenbaum 163k

I guess you can find some interesting data in NCBI/Homologene: ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/homologene.xml.gz
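As a rough sketch of how homology groups could be used once the data is in hand: the code below assumes the records have already been flattened to tab-separated rows of (group ID, NCBI taxonomy ID, gene symbol). That layout is an assumption for illustration, not the format of the XML file above, and the rows are placeholders. It keeps only groups that contain genes from both species of interest (9606 is human, 7955 is zebrafish).

```python
from collections import defaultdict

# Placeholder rows: group ID, NCBI taxonomy ID, gene symbol.
# The layout and the values are assumptions for illustration only.
EXAMPLE_ROWS = (
    "1\t9606\tGENE1_H\n"
    "1\t7955\tgene1_z\n"
    "2\t9606\tGENE2_H\n"
    "3\t9606\tGENE3_H\n"
    "3\t7955\tgene3_z\n"
    "3\t10090\tGene3_m\n"
)


def shared_groups(text, taxid_a, taxid_b):
    """Map group ID -> (genes in taxid_a, genes in taxid_b) for groups with both."""
    groups = defaultdict(lambda: defaultdict(list))
    for line in text.splitlines():
        if not line.strip():
            continue
        group_id, taxid, symbol = line.split("\t")[:3]
        groups[group_id][taxid].append(symbol)
    return {
        group_id: (members[taxid_a], members[taxid_b])
        for group_id, members in groups.items()
        if taxid_a in members and taxid_b in members
    }


pairs = shared_groups(EXAMPLE_ROWS, "9606", "7955")
for group_id, (human_genes, fish_genes) in pairs.items():
    print(group_id, human_genes, fish_genes)
```

This only gives the candidate orthologue pairs; ranking them by conservation still needs a similarity measure such as % identity or dN/dS.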

Updated 9 months ago by Ram 44k • written 14.4 years ago by Pierre Lindenbaum 163k

0

Couldn't download the whole thing :/, but I'm browsing the website now. The organism pool is still limited, but it's an ok start. Thanks!

Reply written 14.4 years ago by bow ▴ 790