Question: WGCNA code for chooseTopHubInEachModule
0
gravatar for krushnach80
21 months ago by
krushnach80570
krushnach80570 wrote:

I m using WGCNA for co-expression network im trying to find the hubgene finding fucntion but im getting error im not sure what im doing wrong.

library(WGCNA)
library(flashClust)
enableWGCNAThreads()
options(stringsAsFactors = FALSE);
countdata <- read.csv('UP_DOWN_WGCNA.csv', header=TRUE)
dim(countdata)
names(countdata)
femData <- countdata
names(femData)
datExpr = as.data.frame(t(femData[, -c(1)]))
View(datExpr)
dim(datExpr)
names(datExpr) = femData$gene
View(datExpr)
rownames(datExpr) = names(femData)[-c(1)]
dim(datExpr)
powers = c(c(1:10), seq(from = 12, to=20, by=2))
#powers = c(1:20)  # in practice this should include powers up to 20.

sft=pickSoftThreshold(datExpr,dataIsExpr = TRUE,powerVector = powers,corFnc = cor,corOptions = list(use = 'p'),networkType = "signed")
sizeGrWindow(9, 7)
par(mfrow = c(1,2))
cex1 = 0.9


plot(sft$fitIndices[,1], -sign(sft$fitIndices[,3])*sft$fitIndices[,2],xlab="Soft Threshold (power)",ylab="Scale Free Topology Model Fit, signed R^2",type="n", main = paste("Scale independence"))
text(sft$fitIndices[,1], -sign(sft$fitIndices[,3])*sft$fitIndices[,2],labels=powers,cex=cex1,col="red")

# Red line corresponds to using an R^2 cut-off
abline(h=0.1,col="red")

# Mean connectivity as a function of the soft-thresholding power
plot(sft$fitIndices[,1], sft$fitIndices[,5],xlab="Soft Threshold (power)",ylab="Mean Connectivity", type="n",main = paste("Mean connectivity"))
text(sft$fitIndices[,1], sft$fitIndices[,5], labels=powers, cex=cex1,col="red")

softPower = 19

adj= adjacency(datExpr,type = "unsigned", power = softPower)

#turn adjacency matrix into topological overlap to minimize the effects of noise and spurious associations
TOM=TOMsimilarityFromExpr(datExpr,networkType = "unsigned", TOMType = "unsigned", power = softPower)

colnames(TOM) =rownames(TOM) = datExpr
dissTOM=1-TOM

geneTree = flashClust(as.dist(dissTOM),method="average")
plot(geneTree, xlab="", sub="",cex=0.3)


minModuleSize = 20;
ds = 3
cutHeight = 0.99999

# Module identification using dynamic tree cut

dynamicMods = cutreeDynamic(dendro = geneTree,  method="tree", minClusterSize = minModuleSize);
dynamicMods = cutreeDynamic(dendro = geneTree, distM = dissTOM, method="hybrid", deepSplit = 2, pamRespectsDendro = FALSE, minClusterSize = minModuleSize);

#the following command gives the module labels and the size of each module. Lable 0 is reserved for unassigned genes
table(dynamicMods)
dynamicColors = labels2colors(dynamicMods)
table(dynamicColors)

plotDendroAndColors(geneTree, dynamicColors, "Dynamic Tree Cut", dendroLabels = FALSE, hang = 0.03, addGuide = TRUE, guideHang = 0.05, main = "Gene dendrogram and module colors")

restGenes= (dynamicColors != "red")
diss1=1-TOMsimilarityFromExpr(datExpr[,restGenes], power = softPower)
collectGarbage()


colnames(diss1) =rownames(diss1) = datExpr[restGenes]
hier1=flashClust(as.dist(diss1), method="average" )
plotDendroAndColors(hier1, dynamicColors[restGenes], "Dynamic Tree Cut", dendroLabels = FALSE, hang = 0.03, addGuide = TRUE, guideHang = 0.05, main = "Gene dendrogram and module colors")
diag(diss1) = NA
sizeGrWindow(7,7)
collectGarbage()
TOMplot(diss1, hier1, as.character(dynamicColors[restGenes]))

module_colors= setdiff(unique(dynamicColors), "grey")
for (color in module_colors){
  module=datExpr[which(dynamicColors==color)]
  write.table(module, paste("module_",color, ".txt",sep=""), sep="\t", row.names=TRUE, col.names=TRUE,quote=FALSE)
}

module.order <- unlist(tapply(1:ncol(datExpr),as.factor(dynamicColors),I))
m<-t(t(datExpr[,module.order])/apply(datExpr[,module.order],2,max))
heatmap(t(m),zlim=c(0,1),col= redWhiteGreen(256),Rowv=NA,Colv=NA,labRow=NA,scale="row",RowSideColors=dynamicColors[module.order])


MEList = moduleEigengenes(datExpr, colors = dynamicColors)
MEs = MEList$eigengenes
plotEigengeneNetworks(MEs, "", marDendro = c(0,4,1,2), marHeatmap = c(3,4,1,2))


#################################################
colnames(TOM) =rownames(TOM) = datExpr
tree1 <- tree2 <- fastcluster::hclust(as.dist(1-TOM),method="average")
colorh = labels2colors(datExpr)
hubs    = chooseTopHubInEachModule(datExpr, colorh)
hubs


Warning message:
In labels2colors(datExpr) :
  labels2colors: Number of labels exceeds number of avilable colors. Some colors will be repeated 1 times.
> hubs    = chooseTopHubInEachModule(datExpr, colorh)
Error in `[.data.frame`(datExpr, , colorh == m) : 
  undefined columns selected

Im getting this error

Warning message: In labels2colors(datExpr) : labels2colors: Number of labels exceeds number of avilable colors. Some colors will be repeated 1 times.

hubs = chooseTopHubInEachModule(datExpr, colorh) Error in [.data.frame(datExpr, , colorh == m) : undefined columns selected

Any help or suggestion would be highly appreciated , am I would like to know if my script is correct or not ..

rna-seq R • 1.6k views
ADD COMMENTlink modified 20 months ago by RamRS23k • written 21 months ago by krushnach80570
1
gravatar for Kevin Blighe
20 months ago by
Kevin Blighe46k
Kevin Blighe46k wrote:

What are the contents of datExpr?

I presume that you have followed the tutorial by Steve Horvath?

ADD COMMENTlink written 20 months ago by Kevin Blighe46k

yes i followed the tutorial ,and "datExpr" contains my gene expression values except gene name ...

ADD REPLYlink written 20 months ago by krushnach80570
1

Okay, great! What's the output of labels2colors(datExpr)?

ADD REPLYlink written 20 months ago by Kevin Blighe46k

this is the message i get

Warning message: In labels2colors(datExpr) : labels2colors: Number of labels exceeds number of avilable colors. Some colors will be repeated 1 times.
ADD REPLYlink written 20 months ago by krushnach80570

Ah, I think that I see what's happening...

WGCNA may have identified just too many modules in your data, i.e., more modules that there are colours (by name) in the R palette! WGCNA, a you know, automatically assigns colour names to each palette.

Any way to see if that is the case? - how many modules were identified?

ADD REPLYlink written 20 months ago by Kevin Blighe46k

41 modules I get should i increase minModuleize which i set to 20 if i increase would it help?

ADD REPLYlink written 20 months ago by krushnach80570

No, not so sure that's the issue.

I think that you need to change this command:

colorh = labels2colors(datExpr)

With that, you may be assigning colours to every value in the entire datExpr object, which would explain why it complains that there are not enough available colours.

That command should just be applied to the module IDs/names, such as:

colorh = labels2colors(rownames(datExpr))

or

colorh = labels2colors(colnames(datExpr))
ADD REPLYlink written 20 months ago by Kevin Blighe46k

i increase minModulesize to 30 now i get like 17 module but still im getting the same issue im not sure what to do ..but i will try your suggestion

ADD REPLYlink written 20 months ago by krushnach80570

may be you can suggest me how to execute the chooseTopHubInEachModule argument

ADD REPLYlink written 20 months ago by krushnach80570
1

No, I'm highly confident that the problem relates to:

colorh = labels2colors(datExpr)

You are attempting to assign a unique colour to all values in datExpr. You only want to assign colours to your identified modules.

Take a look gain at the WGCNA tutorial:

moduleLabels = net$colors;
#Convert the numeric labels to color labels
moduleColors = labels2colors(moduleLabels)

[source: https://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/Consensus-NetworkConstruction-auto.pdf - page 5]

This, then, causes the error in chooseTopHubInEachModule

ADD REPLYlink written 20 months ago by Kevin Blighe46k
1

Also look up in your own code where you previously used the same function:

dynamicColors = labels2colors(dynamicMods)

Always check the contents of your objects so that you understand in which contexts they are to be used.

ADD REPLYlink written 20 months ago by Kevin Blighe46k
1

thank you very much problem solved ,it takes a lot of time for me to go through various resources over the internet then compile and run the code and people like you are of really great help

ADD REPLYlink written 20 months ago by krushnach80570
1

Absolutely no problem - glad to have helped out.

When I have time, I go through tutorials like WGCNA line by line and make my own comments about what is happening. I then save the script for years later

ADD REPLYlink written 20 months ago by Kevin Blighe46k

yes assign colour to the module

ADD REPLYlink written 20 months ago by krushnach80570

yes assign colour to the module

ADD REPLYlink written 20 months ago by krushnach80570
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1545 users visited in the last hour