Question: Normalization And Differential Expression From Array With Small Samples
6.7 years ago by
La La Land
gundalav290 wrote:

I have a dataset that looks like this (3 samples in total):

mRNA       Cancer-Type-1   Cancer-Type-2   Normal
mRNA1      30              49              12
mRNA2      199             200             78
...        ...             ...             ...
mRNA1000   13              40              88

We would like to compare Cancer-Type-1 with Normal and Cancer-Type-2 with Normal. Since the number of samples is very small, I wonder what the best way is to:

  1. Normalize the values; and
  2. Identify which mRNAs are significantly up- or down-regulated when each cancer is compared with normal.

This is typically problematic because methods such as hierarchical clustering or KNN require large sample sizes.
normalization microarray
modified 6.7 years ago by vibhanim.2140 • written 6.7 years ago by gundalav290

What was the array platform? How have the data been processed so far? Was there a larger batch of samples from which these three were drawn that could be used for normalization? Or did you really just run three arrays? The answers to those questions would affect the normalization strategy. For significantly up- or down-regulated genes, you might not be able to do much better than sorting genes by fold change. Any statistics you try to apply will probably be misleading, and nothing would survive any kind of multiple-testing correction.

written 6.7 years ago by Obi Griffith18k

Thanks. The platform is "TORAY". No processing has been done; it is raw data. There is no larger batch of samples, and this is all there is. Advice on the best way to do normalization and DE would be greatly appreciated.

written 6.7 years ago by gundalav290

How many biological replicates do you have for each condition?

written 6.7 years ago by colinDotAIBN20
6.7 years ago by
Obi Griffith18k
Washington University, St Louis, USA
Obi Griffith18k wrote:

I have zero experience with TORAY platform data (I had never even heard of it before), and Google turns up surprisingly little. I would start by plotting the distribution of the raw data for each sample, and maybe comparing MA plots, to get a general sense of how the data look. You might consider quantile normalization (e.g., the function normalize.quantiles in the R package preprocessCore). To determine significantly up- or down-regulated genes you can look at this paper for ideas; there are probably others. But honestly, with just one sample versus two, I think you can forget about statistics. I would just calculate fold changes and maybe bin them according to absolute expression level, so that you treat a 2/1 ratio with more skepticism than a 2000/1000 ratio.
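As a rough illustration of the suggestion above, here is a minimal Python sketch of quantile normalization followed by fold-change ranking with a crude expression-level bin. It is not the preprocessCore implementation: ties are broken by position rather than averaged. The pseudocount of 1 and the `low_cutoff` of 50 are arbitrary assumptions, and only the three rows shown in the question are used (the elided rows are left out).

```python
import math

def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (one list per sample).

    Each value is replaced by the mean, across samples, of the values
    that share its rank. Ties are broken by position (a simplification).
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [sum(sc[k] for sc in sorted_cols) / len(columns) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices in ascending value order
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

def rank_by_fold_change(names, case, control, pseudocount=1.0, low_cutoff=50.0):
    """Return (name, log2 fold change, bin) sorted by |log2 fold change|.

    pseudocount and low_cutoff are illustrative assumptions, not
    platform-specific recommendations.
    """
    rows = []
    for name, a, b in zip(names, case, control):
        lfc = math.log2((a + pseudocount) / (b + pseudocount))
        # Flag ratios built from small intensities so they get more skepticism.
        level = "low" if (a + b) / 2 < low_cutoff else "high"
        rows.append((name, lfc, level))
    return sorted(rows, key=lambda r: abs(r[1]), reverse=True)

# Rows shown in the question; the rows elided with "..." are not reconstructed.
names = ["mRNA1", "mRNA2", "mRNA1000"]
cancer1 = [30, 199, 13]
cancer2 = [49, 200, 40]
normal = [12, 78, 88]

c1, c2, nrm = quantile_normalize([cancer1, cancer2, normal])
ranked_c1 = rank_by_fold_change(names, c1, nrm)
ranked_c2 = rank_by_fold_change(names, c2, nrm)

for name, lfc, level in ranked_c1:
    print(f"{name}\tlog2FC={lfc:+.2f}\texpression={level}")
```

With a single array per condition this gives a ranking only, not a significance call; the bin label is just a reminder that a large ratio between two small intensities is less trustworthy.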

modified 6.7 years ago • written 6.7 years ago by Obi Griffith18k
6.7 years ago by
Manipal University
vibhanim.2140 wrote:

I have not worked on this platform before, but I have done a similar experiment comparing two data sets. All I did was draw MA plots and apply quantile normalization. Performing a t-test helped a lot, and the results were compared afterwards.
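For reference, the values behind an MA plot can be computed as below. This is a minimal sketch assuming strictly positive intensities: M is the log2 ratio of the two samples and A is their mean log2 intensity. The function name `ma_values` is made up for illustration, and plotting (for example, a scatter of M against A) is left out.

```python
import math

def ma_values(case, control):
    """Return (M, A) pairs for two samples of positive intensities.

    M = log2(case) - log2(control)          (log ratio)
    A = (log2(case) + log2(control)) / 2    (mean log intensity)
    """
    pairs = []
    for x, y in zip(case, control):
        lx, ly = math.log2(x), math.log2(y)
        pairs.append((lx - ly, (lx + ly) / 2))
    return pairs

# Rows shown in the question (Cancer-Type-1 vs Normal); elided rows are omitted.
cancer1 = [30, 199, 13]
normal = [12, 78, 88]

for m, a in ma_values(cancer1, normal):
    print(f"M={m:+.2f}  A={a:.2f}")
```

Points with a large |M| at low A are the ratios most likely to be noise, which is the same concern as treating a 2/1 ratio differently from a 2000/1000 ratio.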

written 6.7 years ago by vibhanim.2140