Hi,
I am currently analyzing 90 RNA-seq samples and would like to use WGCNA. However, when I read the tutorials provided by Steve's group, I got lost in some of the details, such as the references to probes. Most of the commands are geared towards microarray data sets, so I had some difficulty running them on my data. I omitted the microarray-specific commands, but I kept getting errors and warnings, and I could not get past the soft-thresholding step. If you have used WGCNA on RNA-seq data with gene counts as input, could you please share the R script you used, just to give me an idea of how to proceed with mine? My input data are gene counts that have been normalized and transformed with the variance stabilizing transformation in DESeq2.
Edit: Here's what I have so far:
#Normalizing and transforming gene counts
> WGCNATest = read.table("C:/Users/yfy/Desktop/NewMicroscopyDE.txt", row.names =1 , header = T, sep = "\t")
> colData=data.frame(row.names = colnames(WGCNATest),
+temp=c("20C","20C","20C","20C","20C","20C","30C","30C","30C","30C","30C","30C","20C","20C","20C","20C","20C","20C","30C","30C","30C","30C","30C","30C","20C","20C","20C","20C","20C","20C","30C","30C","30C","30C","30C","30C"),
+genotype=c("PI","PI","PI","PI","PI","PI","PI","PI","PI","PI","PI","PI","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi"),
+time=c("6","6","6","12","12","12","6","6","6","12","12","12","6","6","6","12","12","12","6","6","6","12","12","12","6","6","6","12","12","12","6","6","6","12","12","12"))
> dds = DESeqDataSetFromMatrix(countData = WGCNATest, colData = colData, design = ~genotype+time+temp)
> colData(dds)$time = relevel(colData(dds)$time, "6")
> dds2 = DESeq(dds, betaPrior = FALSE)
> varianceStabilizingTransformation(dds3, blind=TRUE)
#Automatic construction of the gene network and identification of modules
> WGCNATree = flashClust(dist(colData), method = "average")
> sizeGrWindow(12,9)
NULL
> par(cex = 0.6)
> par(mar = c(0,4,2,0))
> plot(WGCNATree, main = "Sample clustering to detect outliers", sub="", xlab="", cex.lab = 1.5,cex.axis = 1.5, cex.main = 2)
> powers = c(c(1:10), seq(from = 12, to=20, by=2))
> options(stringsAsFactors = FALSE)
> sft = pickSoftThreshold(colData, powerVector = powers, verbose = 5)
pickSoftThreshold: will use block size 3.
pickSoftThreshold: calculating connectivity for given powers...
..working on genes 1 through 3 of 3
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases
In addition: Warning messages:
1: executing %dopar% sequentially: no parallel backend registered
2: In (function (x, y = NULL, use = "all.obs", method = c("pearson", :
NAs introduced by coercion
3: In (function (x, y = NULL, use = "all.obs", method = c("pearson", :
NAs introduced by coercion
4: In eval(expr, envir, enclos) :
Some correlations are NA in block 1 : 3 .
5: In as.vector(log10(dk)) : NaNs produced
Why don't you post what you have so far up until the first error?
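Judging from the log above, the error seems to come from passing colData to pickSoftThreshold (and to dist for the sample tree) instead of the expression values. colData holds only the three sample annotation columns, which would explain "working on genes 1 through 3 of 3", and values such as "20C" or "PI" are not numeric, which most likely explains the "NAs introduced by coercion" warnings. WGCNA expects a numeric matrix with samples in rows and genes in columns. Here is a minimal sketch of what might work instead. It assumes that dds2 from the post is the object meant on the varianceStabilizingTransformation line (dds3 is never defined there), and it uses hclust from base R as a stand-in for flashClust. The names vsd, datExpr, gsg and sampleTree are my own.

```r
library(DESeq2)
library(WGCNA)
options(stringsAsFactors = FALSE)

# Assumption: 'dds2' is the DESeqDataSet created earlier in the post.
# Keep the result of the transformation; the post does not assign it.
vsd <- varianceStabilizingTransformation(dds2, blind = TRUE)

# assay(vsd) has genes in rows and samples in columns.
# WGCNA wants samples in rows and genes in columns, so transpose.
datExpr <- t(assay(vsd))

# Flag genes and samples with too many missing values or zero variance,
# and drop them if anything was flagged.
gsg <- goodSamplesGenes(datExpr, verbose = 3)
if (!gsg$allOK) {
  datExpr <- datExpr[gsg$goodSamples, gsg$goodGenes]
}

# Cluster the samples on expression values (not on colData) to spot outliers.
sampleTree <- hclust(dist(datExpr), method = "average")
plot(sampleTree, main = "Sample clustering to detect outliers",
     sub = "", xlab = "")

# Soft-threshold scan on the expression matrix.
powers <- c(1:10, seq(from = 12, to = 20, by = 2))
sft <- pickSoftThreshold(datExpr, powerVector = powers, verbose = 5)

# Table of scale-free fit and connectivity per power.
print(sft$fitIndices)
```

If sft$powerEstimate comes back as NA, a common fallback is to look through sft$fitIndices for the lowest power at which SFT.R.sq levels off. With tens of thousands of genes this step can be slow, so filtering out very low-count genes before the transformation may help.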