Question: Problem in hierarchical clustering
10 months ago, Calangoa wrote:

Hi there, I have a problem with my hierarchical clustering and would appreciate any help. Let me start from the first step. To identify differentially expressed genes in some microarray studies (each study consists of 3 individual datasets; in total I have 15 datasets), I used the limma package from Bioconductor in R. I then kept the genes with an adjusted P-value below 0.05. After that, I extracted a set of genes involved in, for example, the cell cycle. Finally, this set of genes, with their expression given as log fold change, was used for hierarchical clustering. From what I have read, Euclidean distance with complete linkage is the best choice for log-transformed data like mine. The problem is that when I clustered the 15 datasets, the data from the same study surprisingly ended up close together in one cluster. How can I correct this? Would it be possible to use a single control for the treatment data from all the different studies in R? Or should I take another approach?

Many thanks in advance


Can you show the design matrix and, in particular, whether and how you checked or compensated for potential batch effects?

written 10 months ago by ATpoint

Here is the image of the hierarchical clustering.

I think my mistake is that I did not consider the batch effect. I normalized each study separately and then clustered them together. How can I compensate for the batch effect? Would it be a good idea to normalize all datasets together? I don't know how that would be possible. Any suggestions?

written 10 months ago by Calangoa (modified 10 months ago by RamRS)

Please edit this post and see the changes I've made to see how to add images properly.

Images should be added using the image button, not the link button. You'll need the direct link to the image, not the link to the page hosting the image.

written 10 months ago by RamRS

If you normalize separately, then this result is entirely expected: the datasets within each study are scaled only within that study, not relative to each other. If you do z-scoring, you at least have to normalize them all together, leaving aside whether comparing values from different studies makes sense given the batch effect.
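To illustrate why this is expected, here is a minimal sketch in plain Python (not R) using made-up log fold changes. The sample names and values are hypothetical: study B repeats the two profiles of study A but carries a constant offset of +2, standing in for a batch effect. With Euclidean distance, that offset alone makes matching profiles from different studies look further apart than opposite profiles from the same study.

```python
import math

# Hypothetical log fold changes for 3 genes (toy values, not real data).
# B_1 has the same shape as A_1 and B_2 the same shape as A_2,
# but every value in study B is shifted by +2 (a study-wide offset).
profiles = {
    "A_1": [1.0, -1.0, 0.5],
    "A_2": [-1.0, 1.0, -0.5],
    "B_1": [3.0, 1.0, 2.5],
    "B_2": [1.0, 3.0, 1.5],
}

def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Same underlying pattern, different study.
d_same_pattern = euclidean(profiles["A_1"], profiles["B_1"])
# Opposite patterns, same study.
d_same_study = euclidean(profiles["A_1"], profiles["A_2"])

print(f"same pattern, different study: {d_same_pattern:.3f}")
print(f"different pattern, same study: {d_same_study:.3f}")
```

In this toy case the within-study distance is the smaller one, so a distance-based clustering would group by study first.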

written 10 months ago by ATpoint

I know, but I want to normalize them in a way that compensates for the batch effect, so that I can find which datasets are close to CM regardless of which study each dataset belongs to. Is there any way to do this?

written 10 months ago by Calangoa
10 months ago, leaodel wrote:

If you have a known batch effect and plan to visualize your data you'll need to correct the log-transformed data for this batch effect. I use limma::removeBatchEffect.
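As a rough illustration of the idea only, and not of limma's actual implementation, the sketch below centers one gene's values per batch in plain Python: each batch mean is subtracted and the grand mean is added back. The function name `center_batches` and the toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def center_batches(values, batches):
    """Remove per-batch mean shifts for ONE gene (simplified illustration).

    values  : expression values (log scale), one per sample
    batches : batch label for each sample, same order as values
    """
    grand_mean = mean(values)
    by_batch = defaultdict(list)
    for value, label in zip(values, batches):
        by_batch[label].append(value)
    batch_mean = {label: mean(vals) for label, vals in by_batch.items()}
    # Subtract the batch mean, then restore the overall level.
    return [v - batch_mean[b] + grand_mean for v, b in zip(values, batches)]

# Toy gene: study s2 sits 4 units above study s1 (hypothetical values).
gene = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
study = ["s1", "s1", "s1", "s2", "s2", "s2"]

corrected = center_batches(gene, study)
print(corrected)
```

Plain centering like this also removes any real biological difference that is confounded with the study. In R, `removeBatchEffect` takes a `design` argument so that the conditions of interest are preserved while the batch term is removed.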


No, I don't know; I just want to. After clustering, I found that the different datasets from one study sit close together in one cluster, but that is not correct when they are compared with the CM data. How can I do a meta-analysis and normalize microarray data from different studies?

written 10 months ago by Calangoa

So a batch effect is not something that you'll correct by means of normalization. You have to use a method designed to measure the variance attributed to the batch variable and then correct for it. If you have a hidden batch effect you can use sva.

The sva package can be used to remove artifacts in three ways: (1) identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments (Leek and Storey 2007 PLoS Genetics, 2008 PNAS), (2) directly removing known batch effects using ComBat (Johnson et al. 2007 Biostatistics) and (3) removing batch effects with known control probes (Leek 2014 bioRxiv).
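To make "measuring the variance attributed to the batch variable" concrete, here is a small descriptive sketch in plain Python. It computes, for one gene, the fraction of the total sum of squares that lies between batch means. This is only a simple diagnostic under my own assumptions (the function name and toy values are hypothetical); sva and ComBat fit more elaborate models.

```python
from statistics import mean

def batch_variance_fraction(values, batches):
    """Fraction of one gene's total sum of squares explained by batch."""
    grand_mean = mean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # constant gene: nothing to explain
    groups = {}
    for value, label in zip(values, batches):
        groups.setdefault(label, []).append(value)
    # Between-batch sum of squares: batch size times squared mean shift.
    between_ss = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    return between_ss / total_ss

study = ["s1", "s1", "s1", "s2", "s2", "s2"]
shifted_gene = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]  # s2 is offset from s1
stable_gene = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]   # no offset between studies

frac_shifted = batch_variance_fraction(shifted_gene, study)
frac_stable = batch_variance_fraction(stable_gene, study)
print(f"shifted gene: {frac_shifted:.3f}, stable gene: {frac_stable:.3f}")
```

A value close to 1 for many genes would suggest that study membership dominates the signal.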

Once the batch effect is removed, you can proceed to the hierarchical clustering.
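For reference, the clustering step itself (Euclidean distance with complete linkage, as in the question) can be sketched in a few lines of plain Python. This is a naive illustration on hypothetical, already-corrected toy profiles, not a replacement for `hclust` in R; the function name `complete_linkage` is my own.

```python
import math
from itertools import combinations

def complete_linkage(points):
    """Naive agglomerative clustering with complete linkage.

    points : dict mapping sample name -> list of values
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    def linkage(a, b):
        # Complete linkage: the largest pairwise distance between members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    clusters = [(name,) for name in points]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((a, b, linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical batch-corrected profiles: A_1/B_1 share one pattern,
# A_2/B_2 share the opposite pattern.
samples = {
    "A_1": [1.0, -1.0, 0.5],
    "B_1": [1.1, -0.9, 0.4],
    "A_2": [-1.0, 1.0, -0.5],
    "B_2": [-0.9, 1.2, -0.6],
}

merges = complete_linkage(samples)
for a, b, dist in merges:
    print(a, "+", b, f"at distance {dist:.3f}")
```

With the offset gone, samples merge by pattern across studies rather than by study.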

written 10 months ago by leaodel

Calangoa, if the answer was helpful to solve your problem, please accept it as an answer.

written 10 months ago by leaodel