Question: Reverse Clustering?
8.2 years ago by Eric Fournier (1.4k), Quebec, Canada, wrote:

My apologies if this question's title is vague: I do not know how to label what I am trying to do, which has made my efforts at finding relevant literature frustrating and unsuccessful.

I am analyzing the results of a two-color microarray hybridization experiment using the limma package. To validate the normalization methods I've been applying to the data, I am generating hierarchical clusters of the individual channels of the microarrays to see whether the control samples cluster with each other and the treatment samples cluster with each other. They do not; rather, the green and red channels of each array cluster together, indicating that the array effect dominates every other source of variation. My attempts at changing the normalization algorithms have not corrected this problem.

My hypothesis for the moment is that the microarray I am using (which is of custom design) contains two classes of probes: one that is informative vis-à-vis the biological factor of interest, and another that contains nothing but noise due to a probe-design defect. What I'm looking for is some kind of algorithm/program that would take my microarray data and an a priori expected tree of samples, and partition the probes into those that would support such a tree and those that would not. Is there such a thing, or at least something similar to it that I could use as a starting point for more research?

microarray clustering • 2.7k views
modified 5.0 years ago by Biostar ♦♦ 20 • written 8.2 years ago by Eric Fournier (1.4k)

Did you run a dye-swap experiment to see if you can account for any dye bias?

written 8.2 years ago by Steve Lianoglou (5.1k)

Yes, we are doing dye-swaps. All replicates are biological, and we are alternating the dyes we use for control and treatment.

written 8.2 years ago by Eric Fournier (1.4k)
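
As a side note on why the dye-swap matters here, the idea can be shown with a small sketch. It assumes a simple additive model on the log2 scale (observed log-ratio = treatment effect + dye bias) and a matched pair of arrays with the dyes swapped. The function name and the numbers are hypothetical and are not taken from this thread.

```python
def dye_swap_average(m_forward, m_swapped):
    """Combine per-probe log2(Cy5/Cy3) ratios from a dye-swapped pair.

    m_forward: ratios from the array with treatment in Cy5 (red).
    m_swapped: ratios from the array with treatment in Cy3 (green).
    Returns (treatment_effects, dye_biases), one value per probe.
    Assumes: observed ratio = +/- treatment effect + dye bias.
    """
    effects, biases = [], []
    for mf, ms in zip(m_forward, m_swapped):
        effects.append((mf - ms) / 2.0)  # dye bias cancels out
        biases.append((mf + ms) / 2.0)   # treatment effect cancels out
    return effects, biases


# Hypothetical probe: true effect of +1.0 and dye bias of +0.25 (log2 units)
true_effect, dye_bias = 1.0, 0.25
forward = [true_effect + dye_bias]
swapped = [-true_effect + dye_bias]
effects, biases = dye_swap_average(forward, swapped)
```

With biological replicates, as in the design described above, the arrays are not exact pairs, so in practice the dye term would more likely be estimated as a coefficient in a linear model than by pairwise averaging.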
8.2 years ago by Michael Dondrup (48k), Bergen, Norway, wrote:

With two-color microarrays, channels are normally not analysed separately but as log-ratios. That is because in spotted microarrays (a dated technique anyway) the raw intensities capture mainly array effects, so single-channel analysis is a no-go. You didn't tell us the technology platform, but I guess it might be 'home-brew' arrays or Agilent two-color? In fact, your results are not surprising.

Instead, I would calculate normalized, background-corrected log-ratios of the two channels and use these values. Taking ratios should (in theory) eliminate most of the array effects, the same effects you see in your results as the two channels of each array clustering together. That can be done with limma as well; see a similar question here: http://www.biostars.org/post/show/9372/limma-analysis-for-two-channeled-microarray-data-fetched-using-geoquery/

Here is also a step-by-step walkthrough: http://matticklab.com/index.php?title=Two_channel_analysis_of_Agilent_microarray_data_with_Limma

written 8.2 years ago by Michael Dondrup (48k)
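
The log-ratio argument in the answer above can be illustrated with a toy calculation. This is only a sketch under an assumed multiplicative model (intensity = spot effect x transcript abundance, with the same spot effect acting on both channels of an array). All names and numbers are made up for illustration; real data would also need background correction and normalization, for which limma is the appropriate tool.

```python
import math


def hybridize(abundance, spot_effect):
    """Simulated raw intensities: abundance scaled by a per-spot array effect."""
    return [a * s for a, s in zip(abundance, spot_effect)]


def log_ratio(red, green):
    """Per-probe log2(red / green) for one array."""
    return [math.log2(r) - math.log2(g) for r, g in zip(red, green)]


# Hypothetical abundances for three probes
control = [100.0, 200.0, 400.0]
treated = [200.0, 200.0, 100.0]

# Hypothetical spot effects: very different between the two arrays
spots_array1 = [1.0, 4.0, 0.5]
spots_array2 = [8.0, 0.25, 2.0]

# Treatment in red, control in green on both arrays
red1, green1 = hybridize(treated, spots_array1), hybridize(control, spots_array1)
red2, green2 = hybridize(treated, spots_array2), hybridize(control, spots_array2)

# Single channel: the SAME control sample differs strongly between arrays
control_shift = [math.log2(a) - math.log2(b) for a, b in zip(green1, green2)]

# Log-ratios: the spot effect divides out, so both arrays agree
m1 = log_ratio(red1, green1)
m2 = log_ratio(red2, green2)
```

Here `control_shift` comes out at several log2 units per probe even though the sample is identical, while `m1` and `m2` agree with each other, which mirrors the clustering by array seen in the single-channel data.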

Thank you for your answer, Michael.

We are using Agilent two-color microarrays. I am aware that such microarrays should be studied using log-ratios, and this is how we intend to study the biological effects of the treatment.

However, since the array is custom-designed, we have the possibility of replacing uninformative probes with better-performing ones in the subsequent experiments that we will carry out. I am also under the impression that certain probes are more sensitive to the array effect than others. It is in this perspective that I am trying to "cluster" probes into two categories: those that yield biologically meaningful data, and those where the array effect is predominant, which should be replaced in subsequent array designs.

However, when working on log-ratios, I am "cancelling out" the array effect, and thus cannot draw conclusions on its relative importance for each probe. This is why I was working on single channels, hoping that by first clustering the samples, then finding probes whose variation did not "fit" (for example, probes whose response is always in the maximum range due to repeated elements), I could target such probes for replacement. I assumed this would work, as I have generally had success clustering the various conditions of single-channel data through by-group analysis before, but not this time around.

written 8.2 years ago by Eric Fournier (1.4k)
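
One possible way to make the probe partition described above concrete is to score each probe by how much of its single-channel variance is explained by the array versus by the treatment group. The sketch below is not an established method from this thread. It assumes normalized log2 single-channel intensities and a balanced design with one control and one treatment channel per array, and the function names, labels, and the 0.5 cutoff are all hypothetical choices.

```python
from statistics import mean


def variance_shares(values, arrays, groups):
    """Return (array_share, group_share) of the total sum of squares.

    values: log2 single-channel intensities for one probe, one per channel
    arrays: array label for each channel
    groups: biological group label for each channel
    """
    grand = mean(values)
    total = sum((v - grand) ** 2 for v in values)
    if total == 0:
        return 0.0, 0.0  # constant probe: nothing to partition

    def between(labels):
        # Between-level sum of squares for one factor
        ss = 0.0
        for level in set(labels):
            members = [v for v, lab in zip(values, labels) if lab == level]
            ss += len(members) * (mean(members) - grand) ** 2
        return ss

    return between(arrays) / total, between(groups) / total


def classify_probe(values, arrays, groups, cutoff=0.5):
    """Label a probe by its dominant source of variance (cutoff is arbitrary)."""
    array_share, group_share = variance_shares(values, arrays, groups)
    if group_share >= cutoff:
        return "informative"
    if array_share >= cutoff:
        return "array-dominated"
    return "unclear"


# Hypothetical design: three arrays, two channels each
arrays = ["a1", "a1", "a2", "a2", "a3", "a3"]
groups = ["ctl", "trt", "trt", "ctl", "ctl", "trt"]

probes = {
    "probe_A": [8.0, 10.0, 10.1, 8.1, 7.9, 9.9],   # tracks the treatment
    "probe_B": [12.0, 12.1, 6.0, 5.9, 9.0, 9.1],   # tracks the array
}
labels = {name: classify_probe(v, arrays, groups) for name, v in probes.items()}
```

A probe that is saturated on every array has almost no variance at all, so it would land in "unclear" here; catching those would need a separate check on mean intensity. The scores are also only descriptive: they rank probes for inspection rather than test anything.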