Question

DESeq2 normalization for WGCNA

1

3.5 years ago

jms2520 ▴ 50

Hello all,

I have a large RNA-seq dataset that I am trying to run WGCNA on. The dataset has many variables (3 brain regions, 3 ages, sex, and 2 genotypes). I have been advised to use DESeq2 normalization for these samples prior to doing WGCNA.

I am struggling with two aspects. 1) I assume I am not actually running the DESeq() differential expression on this data, but rather just applying the normalization method, varianceStabilizingTransformation, to the raw counts.

2) I am struggling with the "design" aspect of these functions, as I think that all four variables (region, age, sex and genotype) should play a role in the design. Does the design even matter if I am not using this for differential expression? Is it even used for normalization, or just for the differential expression?

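library(DESeq2)
# Raw counts: genes in rows (IDs taken from the "Genes" column), samples in columns
countData <- read.csv("Raw.Counts.csv", sep = ",", row.names = "Genes")
# Sample metadata: one row per sample, IDs taken from the "Sample_ID" column
colData <- read.csv("Sample.info.numerical.csv", sep = ",", row.names = "Sample_ID")
dds <- DESeqDataSetFromMatrix(countData, colData, design=???)
dds <- estimateSizeFactors(dds)
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
normalized_counts <- counts(dds, normalized = TRUE)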

Deseq2 RNAseq WGCNA • 3.5k views

ADD COMMENT • link updated 3.5 years ago by andres.firrincieli 3.9k • written 3.5 years ago by jms2520 ▴ 50

0

Cross-posted: https://support.bioconductor.org/p/9141259/#9141259

ADD REPLY • link 3.5 years ago by Kevin Blighe 89k

score 0 · Answer 1 · 2021-12-20

0

3.5 years ago

ATpoint 88k

Don't overthink this. Just use any of the options the FAQ recommends in section 4. By the way, if you use blind=TRUE, the design is ignored anyway. Just use that and proceed with the analysis. Or use log2(normalized_counts+1); I doubt it really makes a notable difference.
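For anyone wondering what log2(normalized_counts+1) amounts to, here is a rough base-R sketch of the median-of-ratios idea behind DESeq2's size factors. The toy matrix and all variable names are made up for illustration; on real data you would let DESeq2 do this via estimateSizeFactors() and counts(dds, normalized=TRUE), as in the question.

```r
# Toy raw counts: 4 genes (rows) x 3 samples (columns); values are invented
raw_counts <- matrix(
  c(10, 20, 40,
    100, 200, 400,
    5, 10, 20,
    0, 3, 7),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Per-gene log geometric mean across samples; genes containing a zero
# count give -Inf here and are excluded from the size-factor estimate
log_counts <- log(raw_counts)
log_geo_means <- rowMeans(log_counts)
usable <- is.finite(log_geo_means)

# Size factor of a sample = median ratio of its counts to the gene-wise
# geometric means, computed on the log scale
size_factors <- apply(log_counts[usable, , drop = FALSE], 2, function(col) {
  exp(median(col - log_geo_means[usable]))
})

# Divide each sample (column) by its size factor, then take log2(x + 1)
normalized_counts <- sweep(raw_counts, 2, size_factors, "/")
log_norm <- log2(normalized_counts + 1)
```

In this toy example the three samples differ only in sequencing depth, so after normalization genes 1 to 3 have the same value in every sample.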

ADD COMMENT • link 3.5 years ago by ATpoint 88k

0

I am very familiar with the FAQ page from the WGCNA tutorials. I have previously run the analysis on my RNA-seq data normalized with limma-voom. However, I have been advised that the sample weighting in that pipeline is not appropriate for WGCNA, so I was hoping to use DESeq2 for normalization, as suggested in the FAQ.

Are you implying that I just need to read in the counts matrix and then immediately apply:

countData <- read.csv("Raw.Counts.csv", sep = ",", row.names = "Genes")
vsd <- varianceStabilizingTransformation(countData, blind = TRUE)

ADD REPLY • link 3.5 years ago by jms2520 ▴ 50

0

Yes. If you have a design that you're confident in, you can also include it; see also https://support.bioconductor.org/p/115583/#115585

ADD REPLY • link 3.5 years ago by ATpoint 88k

0

However, I still need a DESeq2 object to use varianceStabilizingTransformation, correct? I cannot just use it on regular count data in a matrix, can I? And I don't need to run estimateSizeFactors() either?

ADD REPLY • link 3.5 years ago by jms2520 ▴ 50

0

varianceStabilizingTransformation accepts either a DESeqDataSet (dds) or a plain count matrix (a data frame has to be converted with as.matrix first), and you do not need to run estimateSizeFactors or estimateDispersions beforehand, because size-factor and dispersion estimation are performed within the varianceStabilizingTransformation function:

function (object, blind = TRUE, fitType = "parametric") 
{
    if (is.null(colnames(object))) {
        colnames(object) <- seq_len(ncol(object))
    }
    if (is.matrix(object)) {
        matrixIn <- TRUE
        object <- DESeqDataSetFromMatrix(object, DataFrame(row.names = colnames(object)), 
            ~1)
    }
    else {
        matrixIn <- FALSE
    }
    if (is.null(sizeFactors(object)) & is.null(normalizationFactors(object))) {
        object <- estimateSizeFactors(object)
    }
    if (blind) {
        design(object) <- ~1
    }
    if (blind | is.null(attr(dispersionFunction(object), "fitType"))) {
        object <- estimateDispersionsGeneEst(object, quiet = TRUE)
        object <- estimateDispersionsFit(object, quiet = TRUE, 
            fitType)
    }
    vsd <- getVarianceStabilizedData(object)
    if (matrixIn) {
        return(vsd)
    }
    se <- SummarizedExperiment(assays = vsd, colData = colData(object), 
        rowRanges = rowRanges(object), metadata = metadata(object))
    DESeqTransform(se)
}

To get the normalized data you need to run:

dds<-DESeqDataSetFromMatrix(countData = countData_m, colData = colData, design = ~ Condition) #this is just an example
deseq2VST <- varianceStabilizingTransformation(dds, blind = TRUE)
deseq2VST <- assay(deseq2VST)
deseq2VST <- as.data.frame(deseq2VST)
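
As a small self-contained sketch of the two input routes visible in the function source above, the following uses simulated counts from makeExampleDESeqDataSet() as a stand-in for real data (the object names are made up). With blind = TRUE the design is reset to ~1 in both cases, so I would expect the two results to agree, but check this on your own data rather than taking it for granted.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real data: 1000 genes x 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Route 1: DESeqDataSet in, DESeqTransform out; assay() extracts the matrix
vst_from_dds <- assay(varianceStabilizingTransformation(dds, blind = TRUE))

# Route 2: plain integer count matrix in, plain matrix out
count_matrix <- counts(dds)
vst_from_matrix <- varianceStabilizingTransformation(count_matrix, blind = TRUE)

# WGCNA expects samples in rows and genes in columns, so transpose
datExpr <- t(vst_from_dds)
```

datExpr can then be passed on to the usual WGCNA steps once low-count genes and outlier samples have been dealt with.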

ADD REPLY • link 3.5 years ago by andres.firrincieli 3.9k