Question: WGCNA analysis in normal and tumor case
gravatar for shivangi.agarwal800
2.3 years ago by
shivangi.agarwal80070 wrote:


I want to perform WGCNA analysis, i am having RNA seq data (gene expression) in normal and tumor cases. I want to compare between control and tumor case with respect to modules, module-trait relationship etc. How to input control and tumor data to WGCNA?. Can I upload them by keeping in same excel sheet.? But then how it would recognize them. Please suggest. Thanks in advance.m

modules wgcna • 1.9k views
ADD COMMENTlink modified 2.3 years ago by Kevin Blighe69k • written 2.3 years ago by shivangi.agarwal80070
gravatar for Kevin Blighe
2.3 years ago by
Kevin Blighe69k
Republic of Ireland
Kevin Blighe69k wrote:

You indicate that your data is currently stored in an Excel sheet, which is not typical. From where did you obtain it? Processed RNA-seq data is usually stored in a CSV or TSV file, and may be tarred and zipped and have the extension *.tar.gz. Note that storing and re-using data from an Excel sheet is not good practice. Excel adds a lot of formatting on top of your data and can even modify / manipulate it in ways that you may not anticipate.

I suggest that you go through the entire WGCNA Tutorial and then apply the methodology to your own data.

With tumour-normal data, you may consider one of the following:

  • run WGCNA separately for each of tumour and normal and then compare results
  • run WGCNA with tumour + normal together and then see which modules (and, from this, which genes contained within those modules) correlate with tumour-normal status.

If you honestly complete the online tutorial even once, you will have a better understanding. That's why the tutorial exists.


ADD COMMENTlink written 2.3 years ago by Kevin Blighe69k

Ya, I meant to say that only. My data is arranged in .tsv format containing gene name in column and sample id in rows. My concern was that if we want to study the difference between two groups (control and tumor) then should we input data containing both groups or we input them separately. But as per your second suggestion (run WGCNA with tumour + normal together and then see which modules (and, from this, which genes contained within those modules) correlate with tumour-normal status), if we run them together, how to compare results between two groups. Thanks in advance.

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by shivangi.agarwal80070

When you run WGCNA with tumour + normal together, the procedure after this is:

  1. for each module's eigenvalues, correlate (cor and cor.test) or regress (lm or glm) these to TumourNormal status (encoded as 1 and 0)
  2. identify modules that are statistically significant from point number 1
  3. perform literature search or pathway analysis on genes in the modules identified in point number 2
ADD REPLYlink written 2.3 years ago by Kevin Blighe69k

Ok thanks, I am looking into some papers. But can you elaborate more point no.1. "for each module's eigenvalues, correlate (cor and cor.test) or regress (lm or glm) these to TumourNormal status (encoded as 1 and 0)"

ADD REPLYlink written 2.3 years ago by shivangi.agarwal80070

Hey, I meant to generate a plot like this, where the modules are correlated to your traits / phenotypes:

While correlating to 1 and 0 is not ideal, it can still be performed. There is a function in WGCNA to generate that plot but I canot find it right now. I have a similar function here, if you need it:



A regression analysis of the modules to TumourStatus is also possible. For example:

glm(TumourNormalStatus ~ blueModule, data=data, family=binomial(link='logit'))
glm(TumourNormalStatus ~ pinkModule, data=data, family=binomial(link='logit'))
et cetera
ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by Kevin Blighe69k

Ya, this module-trait relationship plot, I have got. But to perform regression analysis of module to tumorstatus, I am giving command like, but getting error

glm(TumourNormalStatus ~ darkgreen, family=binomial(link='logit'))

Error in eval(predvars, data, env) : object 'TumourNormalStatus' not found

I dont know how to define tumornormalstatus. Can you help me to define this?. Lots of thanks. I have used codes as:

    mydata = read.csv("up_down_expression_all_samples3_only_normal.csv", sep = "\t");
datExpr0 =[, -c(1)]));
names(datExpr0) = mydata$GENE;
rownames(datExpr0) = names(mydata)[-c(1)];
gsg = goodSamplesGenes(datExpr0, verbose = 3);
sampleTree = hclust(dist(datExpr0), method = "average");
clust = cutreeStatic(sampleTree, minSize = 10)
 datExpr = datExpr0[keepSamples, ]
nGenes = ncol(datExpr)
nSamples = nrow(datExpr)
traitData = read.csv("trait2.csv");
allTraits = traitData[,];
femaleSamples = rownames(datExpr);
traitRows = match(femaleSamples, allTraits$CaseID);
datTraits = allTraits[traitRows,-1];
rownames(datTraits) = allTraits[traitRows,1];
sampleTree2 = hclust(dist(datExpr), method = "average")
traitColors = numbers2colors(datTraits, signed = FALSE);
powers = c(c(1:10), seq(from = 12, to=20, by=2))
sft = pickSoftThreshold(datExpr, powerVector = powers, verbose = 5)
text(sft$fitIndices[,1], -sign(sft$fitIndices[,3])*sft$fitIndices[,2],
text(sft$fitIndices[,1], sft$fitIndices[,5], labels=powers, cex=cex1,col="red")
softPower = 14;
adjacency = adjacency(datExpr, power = softPower);
TOM = TOMsimilarity(adjacency);
dissTOM = 1-TOM
geneTree = hclust(as.dist(dissTOM), method = "average");
minModuleSize = 30;
dynamicMods = cutreeDynamic(dendro = geneTree, distM = dissTOM,
                            deepSplit = 2, pamRespectsDendro = FALSE,
                            minClusterSize = minModuleSize);
mods<- table(dynamicMods)
write.csv (mods, "modules.csv")
dynamicColors = labels2colors(dynamicMods)
MEList = moduleEigengenes(datExpr, colors = dynamicColors)
MEs = MEList$eigengenes
MEDiss = 1-cor(MEs);
METree = hclust(as.dist(MEDiss), method = "average");
MEDissThres = 0.25
abline(h=MEDissThres, col = "red")
merge = mergeCloseModules(datExpr, dynamicColors, cutHeight = MEDissThres, verbose = 3)
mergedColors = merge$colors;
mergedMEs = merge$newMEs;
moduleColors = mergedColors
colorOrder = c("grey", standardColors(50));
moduleLabels = match(moduleColors, colorOrder)-1;
MEs = mergedMEs;
nGenes = ncol(datExpr);
nSamples = nrow(datExpr);
MEs0 = moduleEigengenes(datExpr, moduleColors)$eigengenes
MEs = orderMEs(MEs0)
moduleTraitCor = cor(MEs, datTraits, use = "p");
moduleTraitPvalue = corPvalueStudent(moduleTraitCor, nSamples);
textMatrix =  paste(signif(moduleTraitCor, 2), "\n(",
                    signif(moduleTraitPvalue, 1), ")", sep = "");
dim(textMatrix) = dim(moduleTraitCor)
par(mar = c(6, 8.5, 3, 3));
# Display the correlation values within a heatmap plot
labeledHeatmap(Matrix = moduleTraitCor,
               xLabels = names(datTraits),
               yLabels = names(MEs),
               ySymbols = names(MEs),
               colorLabels = FALSE,
               colors = greenWhiteRed(50),
               textMatrix = textMatrix,
               setStdMargins = FALSE,
               cex.text = 0.5,
               zlim = c(-1,1),
               main = paste("Module-trait relationships"))
Menopausestatus =$Menopausestatus);

names(Menopausestatus) = "Menopausestatus"
modNames = substring(names(MEs), 3)

geneModuleMembership =, MEs, use = "p"));
MMPvalue =, nSamples));

names(geneModuleMembership) = paste("MM", modNames, sep="");
names(MMPvalue) = paste("p.MM", modNames, sep="");

geneTraitSignificance =, Menopausestatus, use = "p"));
GSPvalue =, nSamples));

names(geneTraitSignificance) = paste("GS.", names(Menopausestatus), sep="");
names(GSPvalue) = paste("p.GS.", names(Menopausestatus), sep="");
module = "pink"
column = match(module, modNames);
moduleGenes = moduleColors==module;
ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by shivangi.agarwal80070

you can just add 0 or 1 as tumour status

ADD REPLYlink written 16 months ago by krushnach80870

old question but its more related to R ,I get module plus the tumour status as one data frame , " data=data" this is the total dataframe .isnt it?

ADD REPLYlink written 13 months ago by krushnach80870
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1088 users visited in the last hour