Question

Highly Expressed Genes

3

12.0 years ago

virag ▴ 30

Hi all,

I am looking into some bacterial genomes. For the moment, though, it is OK to assume that I am dealing with just E. coli. I want to get an idea of which genes are highly expressed in E. coli.

One of the most common techniques is to first identify a set of known highly expressed genes (generally the ribosomal protein genes), then use this reference set to build a codon usage table (CUT) for highly expressed genes. The Codon Adaptation Index (CAI) is then calculated for each gene using that CUT, and genes are ranked by their CAI values. This is a classical method, first published by Paul Sharp and his colleagues in 1987. Since then people have come up with minor variants of CAI, but the principle essentially remains the same.
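For anyone who wants to try this, the steps above can be sketched roughly as follows. This is only a minimal illustration, not a reference implementation: the function names (`split_codons`, `relative_adaptiveness`, `cai`) are made up for this post, the standard genetic code is assumed, and the pseudocount of 0.5 for codons absent from the reference set is an assumption you may want to change.

```python
import math
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO_ACIDS,
))


def split_codons(seq):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """Per-codon weight: codon count divided by the count of the most used
    synonymous codon, taken over the reference (highly expressed) gene set."""
    counts = Counter()
    for seq in reference_seqs:
        counts.update(c for c in split_codons(seq) if c in CODON_TABLE)
    weights = {}
    # Met and Trp have a single codon each, and stop codons are not scored.
    for aa in sorted(set(AMINO_ACIDS) - set("*MW")):
        synonymous = [c for c, a in CODON_TABLE.items() if a == aa]
        # Assumption: unseen codons get a pseudocount so a weight is never zero.
        adjusted = {c: counts[c] or pseudocount for c in synonymous}
        best = max(adjusted.values())
        for c in synonymous:
            weights[c] = adjusted[c] / best
    return weights


def cai(seq, weights):
    """Geometric mean of the codon weights over the scored codons of one gene."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    if not logs:
        raise ValueError("sequence contains no scorable codons")
    return math.exp(sum(logs) / len(logs))
```

With real data, `reference_seqs` would be the coding sequences of the ribosomal protein genes, and `cai` would be applied to every gene in the genome before sorting by the resulting value.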

However, I was wondering if I could use publicly available microarray data to get a list of highly expressed genes instead of relying on a theoretical measure like CAI. I do not know if this is really possible, since microarrays are designed for relative studies: the differentially expressed genes they identify are over- or underexpressed only relative to another condition. For example, when wild type is compared to, say, a drug-treated condition, the genes found to be overexpressed in wild type are not necessarily highly expressed in absolute terms; they may simply be expressed at a higher level than in the drug-treated state. So is there a way to use data from NCBI's GEO to get a list of experimentally determined highly expressed genes in the wild-type condition?

Thanks and regards, Sankalp

gene expression microarray • 3.9k views

updated 12.0 years ago by seidel 11k • written 12.0 years ago by virag ▴ 30

0

Highly expressed in E. coli in terms of quantity? Or are you looking at essential genes?

12.0 years ago by Khader Shameer 18k

score 3 · Answer 1 · 2012-04-10

3

12.0 years ago

seidel 11k

In general there are two kinds of microarrays: those that are comparative and measure relative values, as you describe, and those that measure absolute values of a given transcript based on hybridization intensity (such as arrays produced by Affymetrix). Affymetrix produces an array for E. coli, so you could grab data from GEO that was produced using those arrays and rank transcripts by their hybridization intensity, which is a general proxy for "abundance". I would imagine you could also find some RNA-seq data for various prokaryotes and do something similar to find high-abundance transcripts.
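As a rough sketch of the ranking step, assuming you have already downloaded and normalized the intensities (for example to log2 scale) and loaded them into a dictionary keyed by gene or probe set, something like the following would do. The function name `rank_by_intensity` and the numbers in the toy table are made up purely for illustration; they are not real measurements.

```python
from statistics import median


def rank_by_intensity(intensities, top_n=None):
    """Rank genes by their median intensity across samples, highest first.

    `intensities` maps a gene or probe-set ID to a list of normalized
    intensity values, one per wild-type sample (hypothetical input layout).
    """
    summary = {
        gene: median(values)
        for gene, values in intensities.items()
        if values  # skip genes with no measurements
    }
    ranked = sorted(summary.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Toy, made-up log2 intensities for three genes across three samples.
toy_intensities = {
    "rplB": [13.1, 12.9, 13.3],
    "lacZ": [6.2, 6.0, 6.5],
    "ompA": [13.8, 13.6, 13.9],
}

for gene, value in rank_by_intensity(toy_intensities, top_n=2):
    print(gene, value)
```

The median is used rather than the mean so that a single outlying sample has less influence on the ranking; that is a design choice, not a requirement.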

12.0 years ago by seidel 11k

0

Thanks Seidel. I was not aware of microarrays that give a measure of transcript abundance in absolute terms, though the thought of using RNA-seq data did cross my mind at some stage. Thanks a lot.

12.0 years ago by virag ▴ 30

0

Microarrays provide only a crude measure of abundance, in general, but it might very well be better than using CAI.

12.0 years ago by Sean Davis 26k