WGCNA summary output of network analysis results Probe ID's not working
1
0
Entering edit mode
4.3 years ago
joletk • 0

After the Intramodular analysis: identifying genes with high GS and MM you do summary output of network analysis results. When I use the code: names(datExpr) and names(datExpr)[moduleColors=="brown"] I get NULL for both. Which is impossible because I do get the module membership vs gene significance scatterplot with a specific module (in the tutorial they use brown). Can someone please help me to understand what I do wrong and how I can fix this. Without this I do not know which genes belong to the specific colourmodule I am interested in. When I run the tutorial I do see the probe ID’s but with my data set I only get Null.

rna-seq output • 2.7k views
ADD COMMENT
0
Entering edit mode
4.3 years ago

Hi,

what is the output of names(datExpr)?

also did you run names(datExpr) = femData$substanceBXH ?

ADD COMMENT
0
Entering edit mode

This is the code that I used. I get the Module trait relationship heatmap, and the Module membership vs. gene significance for specific color modules. The very last steps do not work.

setwd("C:/Users/Jolet/Dropbox/RNA-seq/Colon/analysis/WGCNA") library(WGCNA) options(stringsAsFactors = FALSE); lnames = load(file = "Colon-dataInput.RData"); lnames lnames = load(file = "Colon-networkConstruction-auto.RData"); lnames

nGenes = ncol(datExpr); nSamples = nrow(datExpr); MEs0 = moduleEigengenes(datExpr, moduleColors)$eigengenes MEs = orderMEs(MEs0) moduleTraitCor = cor(MEs, datTraits, use = "p"); moduleTraitPvalue = corPvalueStudent(moduleTraitCor, nSamples);

sizeGrWindow(10,6) textMatrix = paste(signif(moduleTraitCor, 2), "\n(", signif(moduleTraitPvalue, 1), ")", sep = ""); dim(textMatrix) = dim(moduleTraitCor) par(mar = c(6, 8.5, 3, 3)); labeledHeatmap(Matrix = moduleTraitCor, xLabels = names(datTraits), yLabels = names(MEs), ySymbols = names(MEs), colorLabels = FALSE, colors = blueWhiteRed (50), textMatrix = textMatrix, setStdMargins = FALSE, cex.text = 0.5, zlim = c(-1,1), main = paste("Module-trait relationships"))

SAA = as.data.frame(datTraits$SAA); names(SAA) = "SAA" modNames = substring(names(MEs), 3)

geneModuleMembership = as.data.frame(cor(datExpr, MEs, use = "p")); MMPvalue = as.data.frame(corPvalueStudent(as.matrix(geneModuleMembership), nSamples));

names(geneModuleMembership) = paste("MM", modNames, sep=""); names(MMPvalue) = paste("p.MM", modNames, sep="");

geneTraitSignificance = as.data.frame(cor(datExpr, SAA, use = "p")); GSPvalue = as.data.frame(corPvalueStudent(as.matrix(geneTraitSignificance), nSamples));

names(geneTraitSignificance) = paste("GS.", names(SAA), sep=""); names(GSPvalue) = paste("p.GS.", names(SAA), sep="");

module = "brown" column = match(module, modNames); moduleGenes = moduleColors==module;

sizeGrWindow(7, 7); par(mfrow = c(1,1)); verboseScatterplot(abs(geneModuleMembership[moduleGenes, column]), abs(geneTraitSignificance[moduleGenes, 1]), xlab = paste("Module Membership in", module, "module"), ylab = "Gene significance for body SAA", main = paste("Module membership vs. gene significance\n"), cex.main = 1.2, cex.lab = 1.2, cex.axis = 1.2, col = module) write.csv(textMatrix,file="textMatrix.csv") probes = names(datExpr) geneInfo = data.frame(ID = probes, moduleColor = moduleColors, geneTraitSignificance, GSPvalue) write.csv(geneInfo, file = "geneInfo.csv") corblack = moduleTraitCor[which(row.names(moduleTraitCor)=="MEblack"),] head(corblack[order(-corblack)])

names(datExpr)

names(datExpr)[moduleColors=="brown"]

ADD REPLY
0
Entering edit mode

Hi joletk,

I thought you were running one of the tutorial, but that chunk of code you just posted does not help me to understand where the error is. I guess that names(datExpr) give you Null, therefore I would need the chunk of code that you used to load the data and define datExpr object.

ADD REPLY
0
Entering edit mode

I did follow all the steps of the tutorial and it does work. but when i use my own data i get Null at the very last part of the code like stated above. all of the other steps work. would you like me to post all the code from the beginning?

ADD REPLY
0
Entering edit mode

Yes, please. The chunk of code used to prepare the object datExpr

ADD REPLY
0
Entering edit mode

Display the current working directory

setwd("C:/Users/Jolet/Dropbox/RNA-seq/Colon/analysis/WGCNA")

Load the WGCNA package

library(WGCNA);

The following setting is important, do not omit.

options(stringsAsFactors = FALSE);

Read in the female liver data set

femData = read.csv("GenesC.csv");

Take a quick look at what is in the data set:

dim(femData); names(femData);

transpose data to not include other colums only genes

datExpr0 = as.data.frame(t(femData[, -c(1:1)])); names(datExpr0) = femData$substanceBXH; rownames(datExpr0) = names(femData)[-c(1:1)];

check for samples with too many missing values

gsg = goodSamplesGenes(datExpr0, verbose = 3); gsg$allOK

only perform this step if last statement were false

if (!gsg$allOK) { # Optionally, print the gene and sample names that were removed: if (sum(!gsg$goodGenes)>0) printFlush(paste("Removing genes:", paste(names(datExpr0)[!gsg$goodGenes], collapse = ", "))); if (sum(!gsg$goodSamples)>0) printFlush(paste("Removing samples:", paste(rownames(datExpr0)[!gsg$goodSamples], collapse = ", "))); # Remove the offending genes and samples from the data: datExpr0 = datExpr0[gsg$goodSamples, gsg$goodGenes] }

check for outliers

sampleTree = hclust(dist(datExpr0), method = "average");

Plot the sample tree: Open a graphic output window of size 12 by 9 inches

The user should change the dimensions if the window is too large or too small.

sizeGrWindow(12,9)

pdf(file = "Plots/sampleClustering.pdf", width = 12, height = 9);

par(cex = 0.6); par(mar = c(0,4,2,0)) plot(sampleTree, main = "Sample clustering to detect outliers", sub="", xlab="", cex.lab = 1.5, cex.axis = 1.5, cex.main = 2)

if outliers excists cut it

you have to pick the cut value in this case it is fifteen

Plot a line to show the cut

abline(h = 90000, col = "red");

Determine cluster under the line

clust = cutreeStatic(sampleTree, cutHeight = 90000, minSize = 10) table(clust)

clust 1 contains the samples we want to keep.

keepSamples = (clust==1) datExpr = datExpr0[keepSamples, ] nGenes = ncol(datExpr) nSamples = nrow(datExpr)

load trait data

traitData = read.csv("TraitsC.csv"); dim(traitData) names(traitData)

allTraits = traitData[, -1]; dim(allTraits) names(allTraits) femaleSamples = rownames(datExpr); traitRows = match(femaleSamples, allTraits$Calf); datTraits = allTraits[traitRows, -1]; rownames(datTraits) = allTraits[traitRows, 1]; collectGarbage();

Re-cluster samples

sampleTree2 = hclust(dist(datExpr), method = "average")

Convert traits to a color representation: white means low, red means high, grey means missing entry

traitColors = numbers2colors(datTraits, signed = FALSE);

Plot the sample dendrogram and the colors underneath.

plotDendroAndColors(sampleTree2, traitColors, groupLabels = names(datTraits), main = "Sample dendrogram and trait heatmap")

save(datExpr, datTraits, file = "Colon-dataInput.RData")

ADD REPLY
0
Entering edit mode

Hi joletk

in:

datExpr0 = as.data.frame(t(femData[, -c(1:1)])); 
names(datExpr0) = femData$substanceBXH; 
rownames(datExpr0) = names(femData)[-c(1:1)];

I do not understand the -c(1:1). If you do not have have any outlier, simply run:

options(stringsAsFactors = FALSE);
vstData = read.table("vstData.txt", sep="\t", header=TRUE)
datExpr=data.frame(t(vstData[,2:121])) # my dataset has 120 samples
names(datExpr) = vstData$gene_id;
rownames(datExpr) = names(vstData)[c(2:121)]
ADD REPLY
0
Entering edit mode

Thank you!!! I got it to work.

ADD REPLY
0
Entering edit mode

Happy to hear that!!

ADD REPLY

Login before adding your answer.

Traffic: 1818 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6