After the Intramodular analysis: identifying genes with high GS and MM you do summary output of network analysis results. When I use the code: names(datExpr) and names(datExpr)[moduleColors=="brown"] I get NULL for both. Which is impossible because I do get the module membership vs gene significance scatterplot with a specific module (in the tutorial they use brown). Can someone please help me to understand what I do wrong and how I can fix this. Without this I do not know which genes belong to the specific colourmodule I am interested in. When I run the tutorial I do see the probe ID’s but with my data set I only get Null.
This is the code that I used. I get the Module trait relationship heatmap, and the Module membership vs. gene significance for specific color modules. The very last steps do not work.
setwd("C:/Users/Jolet/Dropbox/RNA-seq/Colon/analysis/WGCNA") library(WGCNA) options(stringsAsFactors = FALSE); lnames = load(file = "Colon-dataInput.RData"); lnames lnames = load(file = "Colon-networkConstruction-auto.RData"); lnames
nGenes = ncol(datExpr); nSamples = nrow(datExpr); MEs0 = moduleEigengenes(datExpr, moduleColors)$eigengenes MEs = orderMEs(MEs0) moduleTraitCor = cor(MEs, datTraits, use = "p"); moduleTraitPvalue = corPvalueStudent(moduleTraitCor, nSamples);
sizeGrWindow(10,6) textMatrix = paste(signif(moduleTraitCor, 2), "\n(", signif(moduleTraitPvalue, 1), ")", sep = ""); dim(textMatrix) = dim(moduleTraitCor) par(mar = c(6, 8.5, 3, 3)); labeledHeatmap(Matrix = moduleTraitCor, xLabels = names(datTraits), yLabels = names(MEs), ySymbols = names(MEs), colorLabels = FALSE, colors = blueWhiteRed (50), textMatrix = textMatrix, setStdMargins = FALSE, cex.text = 0.5, zlim = c(-1,1), main = paste("Module-trait relationships"))
SAA = as.data.frame(datTraits$SAA); names(SAA) = "SAA" modNames = substring(names(MEs), 3)
geneModuleMembership = as.data.frame(cor(datExpr, MEs, use = "p")); MMPvalue = as.data.frame(corPvalueStudent(as.matrix(geneModuleMembership), nSamples));
names(geneModuleMembership) = paste("MM", modNames, sep=""); names(MMPvalue) = paste("p.MM", modNames, sep="");
geneTraitSignificance = as.data.frame(cor(datExpr, SAA, use = "p")); GSPvalue = as.data.frame(corPvalueStudent(as.matrix(geneTraitSignificance), nSamples));
names(geneTraitSignificance) = paste("GS.", names(SAA), sep=""); names(GSPvalue) = paste("p.GS.", names(SAA), sep="");
module = "brown" column = match(module, modNames); moduleGenes = moduleColors==module;
sizeGrWindow(7, 7); par(mfrow = c(1,1)); verboseScatterplot(abs(geneModuleMembership[moduleGenes, column]), abs(geneTraitSignificance[moduleGenes, 1]), xlab = paste("Module Membership in", module, "module"), ylab = "Gene significance for body SAA", main = paste("Module membership vs. gene significance\n"), cex.main = 1.2, cex.lab = 1.2, cex.axis = 1.2, col = module) write.csv(textMatrix,file="textMatrix.csv") probes = names(datExpr) geneInfo = data.frame(ID = probes, moduleColor = moduleColors, geneTraitSignificance, GSPvalue) write.csv(geneInfo, file = "geneInfo.csv") corblack = moduleTraitCor[which(row.names(moduleTraitCor)=="MEblack"),] head(corblack[order(-corblack)])
names(datExpr)
names(datExpr)[moduleColors=="brown"]
Hi joletk,
I thought you were running one of the tutorial, but that chunk of code you just posted does not help me to understand where the error is. I guess that
names(datExpr)
give youNull
, therefore I would need the chunk of code that you used to load the data and define datExpr object.I did follow all the steps of the tutorial and it does work. but when i use my own data i get Null at the very last part of the code like stated above. all of the other steps work. would you like me to post all the code from the beginning?
Yes, please. The chunk of code used to prepare the object datExpr
Display the current working directory
setwd("C:/Users/Jolet/Dropbox/RNA-seq/Colon/analysis/WGCNA")
Load the WGCNA package
library(WGCNA);
The following setting is important, do not omit.
options(stringsAsFactors = FALSE);
Read in the female liver data set
femData = read.csv("GenesC.csv");
Take a quick look at what is in the data set:
dim(femData); names(femData);
transpose data to not include other colums only genes
datExpr0 = as.data.frame(t(femData[, -c(1:1)])); names(datExpr0) = femData$substanceBXH; rownames(datExpr0) = names(femData)[-c(1:1)];
check for samples with too many missing values
gsg = goodSamplesGenes(datExpr0, verbose = 3); gsg$allOK
only perform this step if last statement were false
if (!gsg$allOK) { # Optionally, print the gene and sample names that were removed: if (sum(!gsg$goodGenes)>0) printFlush(paste("Removing genes:", paste(names(datExpr0)[!gsg$goodGenes], collapse = ", "))); if (sum(!gsg$goodSamples)>0) printFlush(paste("Removing samples:", paste(rownames(datExpr0)[!gsg$goodSamples], collapse = ", "))); # Remove the offending genes and samples from the data: datExpr0 = datExpr0[gsg$goodSamples, gsg$goodGenes] }
check for outliers
sampleTree = hclust(dist(datExpr0), method = "average");
Plot the sample tree: Open a graphic output window of size 12 by 9 inches
The user should change the dimensions if the window is too large or too small.
sizeGrWindow(12,9)
pdf(file = "Plots/sampleClustering.pdf", width = 12, height = 9);
par(cex = 0.6); par(mar = c(0,4,2,0)) plot(sampleTree, main = "Sample clustering to detect outliers", sub="", xlab="", cex.lab = 1.5, cex.axis = 1.5, cex.main = 2)
if outliers excists cut it
you have to pick the cut value in this case it is fifteen
Plot a line to show the cut
abline(h = 90000, col = "red");
Determine cluster under the line
clust = cutreeStatic(sampleTree, cutHeight = 90000, minSize = 10) table(clust)
clust 1 contains the samples we want to keep.
keepSamples = (clust==1) datExpr = datExpr0[keepSamples, ] nGenes = ncol(datExpr) nSamples = nrow(datExpr)
load trait data
traitData = read.csv("TraitsC.csv"); dim(traitData) names(traitData)
allTraits = traitData[, -1]; dim(allTraits) names(allTraits) femaleSamples = rownames(datExpr); traitRows = match(femaleSamples, allTraits$Calf); datTraits = allTraits[traitRows, -1]; rownames(datTraits) = allTraits[traitRows, 1]; collectGarbage();
Re-cluster samples
sampleTree2 = hclust(dist(datExpr), method = "average")
Convert traits to a color representation: white means low, red means high, grey means missing entry
traitColors = numbers2colors(datTraits, signed = FALSE);
Plot the sample dendrogram and the colors underneath.
plotDendroAndColors(sampleTree2, traitColors, groupLabels = names(datTraits), main = "Sample dendrogram and trait heatmap")
save(datExpr, datTraits, file = "Colon-dataInput.RData")
Hi joletk
in:
I do not understand the
-c(1:1)
. If you do not have have any outlier, simply run:Thank you!!! I got it to work.
Happy to hear that!!