Question: How Many Genes Differentially Expressed In Microarray Can Be Seen As Normal?
10.4 years ago, Cheng Zhongshan (400) wrote:

Hi, I have a time-course microarray dataset (0h, 24h, 48h, 72h, 96h and 144h after sexual stage induction) for Gibberella zeae, a plant pathogen with about 14,000 protein-coding sequences. After analyzing the arrays with SAS PROC MIXED, I find about 5,000 genes differentially expressed in total across the time course. Is that normal? I would really appreciate any suggestions. Thanks.

microarray • 3.6k views
modified 9.7 years ago by Stefano Berri (4.1k) • written 10.4 years ago by Cheng Zhongshan (400)

When considering this question, you must keep in mind that many of the methods for normalization of microarray data assume that only a small fraction of genes are differentially expressed. So even if a large fraction of genes actually exhibit differential expression, your analysis pipeline might not handle this data well, and you might get unpredictable or nonsensical results.

Reply written 10.3 years ago by Ryan Thompson (3.4k)
10.4 years ago, Stefano Berri (4.1k), Cambridge, UK, wrote:

Last year a paper suggested that nearly all genes are transcriptionally regulated during plant infection.

I think this might actually be the case for all organisms. When something happens, the whole transcriptome is slightly regulated. Some genes change dramatically; the others simply "adjust" to the new "state".

The fact is that, usually, you can show that only a few genes are regulated, because to pass a statistical test you need either a big shift in mean expression value or a great many replicates. Given the cost of microarrays, the latter is rarely possible, so you end up "seeing" only the genes that have big swings in expression. Furthermore, you need to correct for multiple testing, and in making sure you don't have too many false positives, you end up with many false negatives.

The above-mentioned paper had 72 (!) biological replicates, because it pooled all the "controls" of a massive experiment, so its statistics are very powerful.

If you have many replicates and/or the biological replicates are very homogeneous, you might find that many genes come out as regulated.

written 10.4 years ago by Stefano Berri (4.1k)
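
The trade-off between effect size and replicate number described in this answer can be made concrete with a rough power calculation. The sketch below is only an illustration under loud assumptions: it uses a two-sided, two-sample z-test with a known, equal standard deviation (real microarray analyses use t-type or moderated statistics), the standard deviation of 0.5 log2 units is a made-up value, and the Bonferroni-style threshold for about 14,000 genes stands in for whatever multiple-testing correction is actually used. The function name `approx_power` is hypothetical.

```python
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha):
    """Approximate power of a two-sided, two-sample z-test.

    delta: true difference in mean log2 expression between the groups
    sigma: per-group standard deviation (assumed known and equal)
    n_per_group: number of biological replicates in each group
    alpha: per-test significance threshold
    """
    z = NormalDist()
    # Standard error of the difference between two group means
    se = sigma * (2.0 / n_per_group) ** 0.5
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(delta) / se
    # Probability that the test statistic falls beyond either critical value
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Illustrative stringent threshold: 0.05 spread over ~14000 genes
    alpha = 0.05 / 14000
    for n in (4, 72):
        for delta in (0.5, 1.0, 2.0):
            power = approx_power(delta, sigma=0.5, n_per_group=n, alpha=alpha)
            print(f"n={n:2d}  log2 shift={delta:.1f}  power~{power:.3f}")
```

Under these assumptions, a small shift that is almost never detected with 4 replicates per group becomes detectable most of the time with 72, which is the point made above about why few genes usually pass the test.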

OK, thanks very much. Your suggestions are really helpful. In fact, my microarray experiment has only four replicates, so there would be a lot of noise among them, though I used FDR and a fold change > 2 as cutoffs. All of these procedures were carried out with SAS PROC MIXED. I will try other methods to reanalyze my data; maybe by comparison I can avoid some big mistakes.

Reply written 10.4 years ago by Cheng Zhongshan (400)
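
For readers who want to see what an "FDR plus fold change > 2" cutoff amounts to, here is a minimal sketch in plain Python. It assumes the FDR control is the Benjamini-Hochberg procedure (the thread does not say which method was used) and that fold changes are on a log2 scale, so a two-fold change corresponds to an absolute log2 fold change of at least 1. The function names, gene IDs and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(results, fdr=0.05, min_abs_log2fc=1.0):
    """Keep genes passing both the FDR and the fold-change cutoff.

    results: list of (gene_id, log2_fold_change, p_value) tuples
    """
    adjusted = benjamini_hochberg([p for _, _, p in results])
    return [
        gene
        for (gene, log2fc, _), q in zip(results, adjusted)
        if q <= fdr and abs(log2fc) >= min_abs_log2fc
    ]


if __name__ == "__main__":
    # Made-up example values, not data from the experiment discussed here
    example = [
        ("gene1", 2.0, 0.005),
        ("gene2", 0.3, 0.01),
        ("gene3", -1.5, 0.03),
        ("gene4", 1.2, 0.9),
    ]
    print(select_genes(example))
```

Note that the fold-change filter is applied in addition to the FDR threshold, so a gene with a small but statistically solid change (like the hypothetical `gene2`) is dropped.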

Great info. 72 replicates is very impressive, and it produced some eye-opening results.

Reply written 10.4 years ago by Istvan Albert (♦♦ 85k)
10.4 years ago, Daniel Swan (13k), Aberdeen, UK, wrote:

There's no metric for a 'normal' number of differentially expressed genes in a microarray experiment; this is going to vary massively depending on your experimental conditions. I've seen very well replicated experiments in which thousands of differentially expressed genes are detectable in a very robust fashion, and other very targeted experiments (siRNA knockdowns) in which only a handful of genes are perturbed.

Given that you're reporting a number of genes differentially expressed 'in total of the time course' maybe you should be looking at changes between the timepoints as well as across the whole experiment?

The real issue is that dissecting a gene list 5000 genes long to get any more meaningful information is a bit more of a challenge than dissecting one 500 genes long.

written 10.4 years ago by Daniel Swan (13k)

Actually, I use 0h as the control and compare each of the other time points with it. I then use a Perl script to find the intersections of the resulting gene lists. For example, with 24h as A, 48h as B, and so on, I get subsets like the following: A, AB, AC, AD, AE, ABC, ABD, ABE, CDE, ..., ABCDE. Does this make sense?

Reply written 10.4 years ago by Cheng Zhongshan (400)
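
The subset bookkeeping described in this comment (A, AB, ABC, ...) can be sketched in a few lines of Python. This is not the poster's Perl script, just an illustration of one way to do it: each gene is assigned to the exact combination of comparisons in which it was called differentially expressed, so the regions do not overlap. The function name, set labels and gene IDs are hypothetical.

```python
from collections import defaultdict


def partition_by_membership(de_sets):
    """Group genes by the exact combination of sets they belong to.

    de_sets: dict mapping a label (e.g. "A" for 24h vs 0h) to the set of
             genes called differentially expressed in that comparison.
    Returns a dict mapping a combined label such as "AB" to the genes
    found in exactly those comparisons and no others.
    """
    # Record, for every gene, which comparisons it appears in
    membership = defaultdict(set)
    for label, genes in de_sets.items():
        for gene in genes:
            membership[gene].add(label)

    # Collect genes under their sorted, concatenated label combination
    regions = defaultdict(set)
    for gene, labels in membership.items():
        regions["".join(sorted(labels))].add(gene)
    return dict(regions)


if __name__ == "__main__":
    # Made-up gene lists for three comparisons against 0h
    example = {
        "A": {"g1", "g2", "g3"},  # 24h vs 0h
        "B": {"g2", "g3"},        # 48h vs 0h
        "C": {"g3", "g4"},        # 72h vs 0h
    }
    for combo, genes in sorted(partition_by_membership(example).items()):
        print(combo, sorted(genes))
```

Keep in mind that gene lists built this way depend heavily on the cutoff used in each comparison: a gene just below threshold at one time point ends up in a different region, which is part of why the replies suggest modelling the contrasts directly instead.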

I'd be more tempted to use something other than a Perl script for doing Venn diagrams. I'd seriously consider using something that allows you to set up meaningful contrasts (limma in Bioconductor, for instance) to analyse this data. There are plenty of packages designed specifically for time-course data. maSigPro comes to mind as well: (that reference should prove to be interesting regardless of whether or not you use the methodology)

Reply written 10.4 years ago by Daniel Swan (13k)

Thanks. I will follow your suggestion and reanalyze my microarray data.

Reply written 10.4 years ago by Cheng Zhongshan (400)