Question: Two questions about spike-in data sets for microarray experiments
4.8 years ago by
Avro140 wrote:

Hi everyone,

I am reading a book called "Statistics and Data Analysis for Microarrays Using R and Bioconductor". More specifically, I am looking at the limitations of microarrays, and I don't understand this sentence:

"The variance of average chip intensity among spike-in data sets is much lower than those measured in most real-life data sets, casting doubts on the general applicability of these data for developing analytical tools for highly diverse clinical expression profiles."

I have two questions:

1) I understand that spike-in data sets are controls that you include in your sample preparation, but how are they used when you analyze/transform your data?

2) What does the author mean by "the variance of average chip intensity among the spike-in data sets"? I know what variance is. For example, if I have 42 control genes, do I compute the average intensity for all of them for each array and then compute the variance?

Thank you!



4.8 years ago by
JC10k wrote:

1) Spike-in sequences are used to scale intensities properly across chips. Suppose you have one spike-in gene on two chips: if the first chip reports an expression level of 100 for this gene and the second reports 200, you can bring them onto the same scale either by doubling all values on chip 1 or by halving all values on chip 2. In practice you have more than one spike-in sequence, each at a different concentration, so you can adjust the intensity distributions using more sophisticated methods.

2) Yes. But the point is that spike-in sequences have lower variance than the real genes in your samples, so they are not very useful as a model of real-life data.
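
As a rough illustration of both points, here is a minimal Python sketch (not taken from the book or from any Bioconductor package). The probe names, the intensity values, and the helper functions `scale_to_reference` and `variance_of_chip_means` are all made up for the example. The first helper reuses the 100 vs. 200 case from point 1; the second assumes that "average chip intensity" means the mean over whichever probes you choose on each array.

```python
from statistics import mean, variance


def scale_to_reference(chip, reference, spike_id):
    """Scale every intensity on `chip` so its spike-in matches `reference`.

    Hypothetical helper: a single global factor taken from one spike-in probe.
    """
    factor = reference[spike_id] / chip[spike_id]
    return {probe: value * factor for probe, value in chip.items()}


def variance_of_chip_means(arrays):
    """Average the chosen probes on each array, then take the
    sample variance of those per-array averages."""
    chip_means = [mean(intensities) for intensities in arrays]
    return variance(chip_means)


# Point 1: the spike-in reads 100 on chip 1 and 200 on chip 2 (invented values).
chip1 = {"spike_1": 100.0, "gene_a": 50.0, "gene_b": 400.0}
chip2 = {"spike_1": 200.0, "gene_a": 120.0, "gene_b": 700.0}
chip1_scaled = scale_to_reference(chip1, chip2, "spike_1")  # every value doubled

# Point 2: three arrays, each with the same three probes (invented values).
arrays = [
    [100.0, 200.0, 300.0],  # array mean 200
    [110.0, 210.0, 310.0],  # array mean 210
    [90.0, 190.0, 290.0],   # array mean 190
]
spread = variance_of_chip_means(arrays)  # variance of 200, 210, 190
```

A low `spread` across arrays is what the quoted sentence describes for spike-in data sets; real clinical samples would typically give a larger value.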


Hi! Thank you for your answers! So this is done so that we can compare mRNA expression between different platforms/conditions; it's all about normalization. I'm sorry, but I don't understand your last sentence about using different concentrations of the same sequence.

So, these spike-ins are only good for normalizing since they do not reflect the full spectrum of gene expression variability, right? 

Thank you very much! 


written 4.8 years ago by Avro140

No, you have several sequences (each one different), each at a known concentration, as spike-ins.

And yes, they are good for normalization between samples, but real sequences can be more variable.
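
With several spike-ins at different concentrations, one simple option (an assumption for illustration, not necessarily what any particular package does) is to take the median of the per-spike-in ratios as the chip's scale factor, so that a single odd probe does not dominate. The probe names, values, and the function `spike_in_scale_factor` below are hypothetical:

```python
from statistics import median


def spike_in_scale_factor(chip, reference, spike_ids):
    """Median of reference/chip intensity ratios over the spike-in probes."""
    ratios = [reference[s] / chip[s] for s in spike_ids]
    return median(ratios)


# Three different spike-in sequences covering low, medium and high intensity.
spike_ids = ["spike_low", "spike_mid", "spike_high"]
chip = {"spike_low": 100.0, "spike_mid": 1000.0,
        "spike_high": 10000.0, "gene_a": 250.0}
reference = {"spike_low": 210.0, "spike_mid": 2000.0,
             "spike_high": 19000.0, "gene_a": 480.0}

factor = spike_in_scale_factor(chip, reference, spike_ids)
# Apply the same factor to every probe on the chip, spike-ins and real genes alike.
chip_scaled = {probe: value * factor for probe, value in chip.items()}
```

Real normalization methods usually go further than one global factor, for example by adjusting across the whole intensity range.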

written 4.8 years ago by JC10k