R-Script To Analyze Rna-Seq Data Set Using Wgcna
3
4
Entering edit mode
10.1 years ago
yfyian ▴ 40

Hi,

I am currently analyzing 90 RNA-seq samples. I am interested in using WGCNA for my analysis. Nonetheless, when I read the tutorials provided by Steve's group, I felt a little bit lost with some details like probe, and etc. Since most of the command lines are geared towards microarray data sets, I am having a little bit difficulty when I tried running the suggested command lines. I omitted the ones related to microarray in my analysis. And, I kept getting error or warning messages. I could not even continue past soft thresholding calculation step. If you have used WGCNA to analyze RNA-seq data sets using gene counts as input data, can you please share with me the R-script you used for your analysis just to give me an idea how to proceed with mine? I am using normalized and transformed (variance stabilizing transformation in DESeq2) gene counts as my input data.

Edit: Here's what I have so far:

#Normalizing and transforming gene counts
> WGCNATest = read.table("C:/Users/yfy/Desktop/NewMicroscopyDE.txt", row.names =1 , header = T, sep = "\t")
> colData=data.frame(row.names = colnames(WGCNATest),
    +temp=c("20C","20C","20C","20C","20C","20C","30C","30C","30C","30C","30C","30C","20C","20C","20C","20C","20C","20C","30C","30C","30C","30C","30C","30C","20C","20C","20C","20C","20C","20C","30C","30C","30C","30C","30C","30C"),
    +genotype=c("PI","PI","PI","PI","PI","PI","PI","PI","PI","PI","PI","PI","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","SAL","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi","RNAi"),
    +time=c("6","6","6","12","12","12","6","6","6","12","12","12","6","6","6","12","12","12","6","6","6","12","12","12","6","6","6","12","12","12","6","6","6","12","12","12"))
> dds = DESeqDataSetFromMatrix(countData = WGCNATest, colData = colData, design = ~genotype+time+temp)
> colData(dds)$time = relevel(colData(dds)$ time, "6")
> dds2 = DESeq(dds, betaPrior = FALSE)
> varianceStabilizingTransformation(dds3, blind=TRUE)

#Automatic construction of the gene network and identification of modules
> WGCNATree = flashClust(dist(colData), method = "average")
> sizeGrWindow(12,9)
NULL
> par(cex = 0.6)
> par(mar = c(0,4,2,0))
> plot(WGCNATree, main = "Sample clustering to detect outliers", sub="", xlab="", cex.lab = 1.5,cex.axis = 1.5, cex.main = 2)
> powers = c(c(1:10), seq(from = 12, to=20, by=2))
> options(stringsAsFactors = FALSE)
> sft = pickSoftThreshold(colData, powerVector = powers, verbose = 5)
pickSoftThreshold: will use block size 3.
 pickSoftThreshold: calculating connectivity for given powers...
   ..working on genes 1 through 3 of  3
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  0 (non-NA) cases
In addition: Warning messages:
1: executing %dopar% sequentially: no parallel backend registered 
2: In (function (x, y = NULL, use = "all.obs", method = c("pearson",  :
  NAs introduced by coercion
3: In (function (x, y = NULL, use = "all.obs", method = c("pearson",  :
  NAs introduced by coercion
4: In eval(expr, envir, enclos) :
  Some correlations are NA in block 1 : 3 .
5: In as.vector(log10(dk)) : NaNs produced
rnaseq • 12k views
ADD COMMENT
0
Entering edit mode

Why don't you post what you have so far up until the first error?

ADD REPLY
3
Entering edit mode
10.1 years ago

Your problem appears to actually start here:

> varianceStabilizingTransformation(dds3, blind=TRUE)

which should be:

> vsd = getVarianceStabilizedData(dds2, blind=TRUE)

Then, remember that you normally want to work with the transformed counts rather than the information describing what group(s) each of the values in the count matrix is in (colData). So, this

> WGCNATree = flashClust(dist(colData), method = "average")

should likely be something like:

> vsd2 <- t(vsd) #Many functions expect the matrix to be transposed
> WGCNATree = flashClust(dist(vsd2), method = "average")

Similarly, instead of:

> sft = pickSoftThreshold(colData, powerVector = powers, verbose = 5)

you actually want something like:

> sft = pickSoftThreshold(vsd2, powerVector = powers, verbose = 5)
ADD COMMENT
0
Entering edit mode

Hi dpryan!!

Thank you for your quick response. I tried the command line on vsd you suggested. Here's what I got:

vsd = getVarianceStabilizedData(dds2, blind=TRUE) Error in getVarianceStabilizedData(dds2, blind = TRUE) : unused argument (blind = TRUE)

It worked when I took out the command blind=TRUE. But, I'd like the transformation to be blind to the sample information specified by the design formula. Hence, I should use blind=TRUE. How do I get it to work together with blind=TRUE?

Thanks again, Yoong

ADD REPLY
0
Entering edit mode

Hi dpryan,

Here's what I did after sending the above message:

# to transform data without using information specified by design formula
> varianceStabilizingTransformation(dds2, blind=TRUE)
> vsd = getVarianceStabilizedData(dds2)

I think my data has been transformed without using any info specified by design formula. And, I assigned new object vsd into my R-analysis. I think that's what you wanted to achieve, right?

Yoong

ADD REPLY
0
Entering edit mode

I've generally assumed that getVarianceStabilizedData() always uses blind, since that's the default for varianceStabilizingTransformation(). BTW, just running varianceStabilizingTransformation(dds2, blind=TRUE) without assigning the resulting SummarizedExperiment object to something is pointless (i.e., you're not actually doing anything, since the function doesn't modify it's input). If you really want to be sure, then:

vsd <- varianceStabilizingTransformation(dds2, blind=TRUE)
vc <- assay(vsd)

or something like that

ADD REPLY
0
Entering edit mode

I managed to get my soft threshold (power) with your suggestions. Thanks a lot for your input!!!!!

ADD REPLY

Login before adding your answer.

Traffic: 1966 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6