Question: Fold Change Detection In Affymetrix Microarray Data Without Replicates
8.4 years ago by
elb200 wrote:

Dear users, I have a question regarding microarray data analysis (Affymetrix, one color). I have just 1 TREATMENT sample and 1 REFERENCE sample. Neither technical nor biological replicates are available. A statistical test to find differentially expressed genes between the two conditions (even a simple t-test) seems impossible to me because of the absence of replicates. The people who asked me to do the analysis are interested only in finding the genes that change between the two conditions. Under these conditions, in my opinion, only the fold change can be computed, just to give a general view of the behavior of the genes. Any other suggestions about this issue?

Thanks a lot


R bioconductor microarray • 4.1k views
8.4 years ago by
Istvan Albert ♦♦ 86k
University Park, USA
Istvan Albert ♦♦ 86k wrote:

As Will put it, there is only so much you can say about such data.

Perhaps you can find a subset of genes that are known to be unaffected by the treatment, then use those to build the empirical distribution of the expected variation. Then use that distribution to estimate the likelihood of observing a given difference in expression in the remainder of the genes. This is still weak, since it will be strongly affected by errors.


I agree with Istvan's suggestion. You can pick a few housekeeping genes. You can use a list of genes that appear as housekeeping genes in different studies and use them to calculate the expected variation.

Reply written 8.4 years ago by Ashutosh Pandey (12k)

Good idea. I'm interested in the technical details of such a procedure. Do you have any reference in hand?

Reply written 8.4 years ago by Woa (2.8k)

You will probably be hard-pressed to find a reference on how to do statistical analysis with no replicates.

What is described is just basic statistics: create a histogram and fit it with a normal distribution, or just look at the percentage of values within a certain distance. That will give you the probability of observing that difference. Then look at how many you actually observe relative to how many you would expect under your empirical distribution.
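
A minimal Python sketch of that procedure, under loud assumptions: the gene names and intensities are made up, the "housekeeping" genes are simply assumed to be unaffected by the treatment, and their log2 ratios are assumed to be roughly normal. The function names (`fit_null`, `tail_prob`, `observed_vs_expected`) are hypothetical helpers, not from any package, and the tail probability is only a rough guide, not a real p-value.

```python
import math
import statistics

# All intensities below are made-up illustration values, not real array data.
# Each entry maps gene -> (treatment intensity, reference intensity).
housekeeping = {
    "HK1": (1020.0, 1000.0),
    "HK2": (480.0, 500.0),
    "HK3": (2100.0, 2000.0),
    "HK4": (760.0, 800.0),
    "HK5": (1500.0, 1490.0),
}
other_genes = {
    "geneA": (4000.0, 1000.0),
    "geneB": (510.0, 500.0),
    "geneC": (300.0, 1200.0),
}


def log2_ratios(pairs):
    """Return gene -> log2(treatment / reference); intensities must be positive."""
    return {gene: math.log2(t / r) for gene, (t, r) in pairs.items()}


def fit_null(hk_ratios):
    """Fit a normal to the housekeeping log2 ratios: (mean, sample std dev)."""
    values = list(hk_ratios.values())
    return statistics.mean(values), statistics.stdev(values)


def tail_prob(x, mu, sigma):
    """Two-sided probability under N(mu, sigma) of a deviation at least |x - mu|."""
    return math.erfc(abs(x - mu) / (sigma * math.sqrt(2)))


def observed_vs_expected(ratios, mu, sigma, cutoff):
    """Count genes at least `cutoff` away from mu, and the count the null predicts."""
    observed = sum(1 for v in ratios.values() if abs(v - mu) >= cutoff)
    expected = len(ratios) * tail_prob(mu + cutoff, mu, sigma)
    return observed, expected


mu, sigma = fit_null(log2_ratios(housekeeping))
ratios = log2_ratios(other_genes)
for gene, value in ratios.items():
    print(gene, round(value, 2), tail_prob(value, mu, sigma))
print(observed_vs_expected(ratios, mu, sigma, cutoff=1.0))
```

With only a handful of housekeeping genes the fitted spread is itself very uncertain, so the numbers are best used to rank candidates rather than to claim significance.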

Reply written 8.4 years ago by Biostar User (1.0k), modified 8.4 years ago by Istvan Albert

Sorry, I don't have a reference, but there are some normalization methods that involve using housekeeping genes.

Or you can just use the expression of the housekeeping genes to get an idea of how much variation is observed between the two datasets, and from that come up with a fold-change threshold for calling a gene differentially expressed. But I don't think you can calculate p-values at any point.

Reply written 8.4 years ago by Ashutosh Pandey (12k), modified 8.4 years ago by Istvan Albert
8.4 years ago by
Obi Griffith19k
Washington University, St Louis, USA
Obi Griffith19k wrote:

Here are a few possibilities that you might consider:

  1. Some kind of outlier analysis on the difference (or fold-change) between treatment and reference. You might hope that important biological differences would stand out compared to differences arising from just sample-to-sample variation.
  2. You could pretend that these are RNA-seq samples and convert to count-based data. People do statistics (e.g., Fisher's exact test) on a single sample vs. a single sample in the RNA-seq field all the time.
  3. You could calculate the change in gene rank between treatment and reference. Then you could set up a random permutation test in which you randomly assign ranks to genes by drawing from the reference and treatment, and see whether the actual data contain any unusually "lucky" jumps in rank compared to the random simulations.
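
As a rough illustration of option 1, here is a Python sketch that scores each gene's log2 ratio against the bulk of the genes using the median and the median absolute deviation (MAD). This is one possible way to do an outlier analysis, not a standard pipeline: the helper names are hypothetical, the example values are invented, and the cutoff of 3.5 is an arbitrary illustrative choice. The constant 1.4826 rescales the MAD so that it matches the standard deviation for normally distributed data.

```python
import statistics


def robust_z_scores(log_ratios):
    """Return gene -> (log ratio - median) / (1.4826 * MAD)."""
    values = list(log_ratios.values())
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        raise ValueError("MAD is zero; log ratios are too uniform to score")
    return {gene: (v - med) / scale for gene, v in log_ratios.items()}


def flag_outliers(log_ratios, cutoff=3.5):
    """Genes whose robust z-score exceeds the (arbitrary) cutoff in absolute value."""
    scores = robust_z_scores(log_ratios)
    return sorted(gene for gene, z in scores.items() if abs(z) > cutoff)


# Made-up log2(treatment / reference) values for illustration only.
example = {"g1": 0.0, "g2": 0.1, "g3": -0.1, "g4": 0.05, "g5": -0.05, "g6": 3.0}
print(flag_outliers(example))
```

The median and MAD are used instead of the mean and standard deviation so that the genuinely changed genes do not inflate the spread they are being compared against.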

Just to be clear: these are all really bad options. Trying to come up with statistics with no replicates will likely just get you smacked down by a reviewer if you ever attempt to publish. Your original idea of just using fold changes is probably best, although I would also consider the absolute expression level of each gene. You might put more weight on a gene with FC=2 if it went from 10,000 to 20,000 than if it went from 1 to 2. Any candidates you identify will have to be validated before they are worth anything at all. Suggestion #1 above could also help you identify fold-change values that really stand out. Good luck!
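
One way to take expression level into account is to compute, for each gene, both the log2 fold change (M) and the average log2 intensity (A), and keep only genes that pass a threshold on both. The Python sketch below is just an illustration: `ma_values` and `candidates` are hypothetical helper names, the thresholds `min_abs_m=1.0` (a 2-fold change) and `min_a=7.0` are arbitrary assumptions that would need tuning on real data, and the two genes are the 10,000 to 20,000 and 1 to 2 cases from the paragraph above.

```python
import math


def ma_values(treatment, reference):
    """Return (M, A): log2 fold change and mean log2 intensity of one gene."""
    m = math.log2(treatment / reference)
    a = 0.5 * (math.log2(treatment) + math.log2(reference))
    return m, a


def candidates(intensities, min_abs_m=1.0, min_a=7.0):
    """Genes with |M| >= min_abs_m and A >= min_a, largest |M| first."""
    hits = []
    for gene, (t, r) in intensities.items():
        m, a = ma_values(t, r)
        if abs(m) >= min_abs_m and a >= min_a:
            hits.append((gene, m, a))
    # Sort by fold-change magnitude, breaking ties by higher intensity.
    return sorted(hits, key=lambda h: (-abs(h[1]), -h[2]))


# Both genes have FC = 2, but only one is expressed well above the noise floor.
example = {"high": (20000, 10000), "low": (2, 1)}
for gene, m, a in candidates(example):
    print(gene, m, round(a, 2))
```

Both genes have M = 1, but only the highly expressed one survives the intensity filter, which is the weighting described above.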

8.4 years ago by
Columbia University
Federico Giorgi630 wrote:

In your case, I fear you can assess your data by fold change only (there are also solutions like PUMA), but the lack of replicates is not necessarily a roadblock. Simple fold changes can be complemented with functional analysis of the most-changed genes, comparison with known contrasts (e.g. the FARO server), and mapping of the changed genes onto known pathways (KEGG, MapMan).

Generally, if the biological response is strong enough, you will see it, regardless of replicates and regardless of p-values. It will make sense if you put enough thought and time into it.

Good luck!

8.4 years ago by
United States
Will4.5k wrote:

You're going to be hard-pressed to squeeze any results out of such a small dataset. Can you supplement your data with publicly available datasets from repositories like GEO or ArrayExpress?

If not then you're limited to fold-change. You can rank your results by fold-change but you'll get a lot of false positives.

8.3 years ago by
elb200 wrote:

Hi guys, thank you very much for your valuable suggestions. What we did was use some genes as "housekeeping" genes, even though we were not totally sure that those genes were unaffected by the treatment. Anyway, this is the best we could do under those conditions. Thank you again!
