Question: DESeq2: vst() and varianceStabilizingTransformation()
12 weeks ago, lenC_biotecLover wrote:

Hi everyone, I'm exploring the DESeq2 package, in particular the varianceStabilizingTransformation function. I can't completely understand the differences between this and the vst function: when should I use them, and why should I prefer one or the other? Thank you

Tags: normalization, rna-seq, sva, R
12 weeks ago, Kevin Blighe (Republic of Ireland) wrote:

The difference is subtle, but it means that vst() can perform the transformation more quickly.

vst() is, in fact, a wrapper around varianceStabilizingTransformation(). It first selects a subset of genes (1000 by default) that is 'representative' of the dataset's dispersion trend, estimates the trend from that subset only, and then uses it to transform all genes.
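One way to picture how a 'representative' subset could be picked is to rank genes by their mean normalized count and take genes at evenly spaced ranks. The following is only a sketch of that idea in base R, not a copy of the DESeq2 internals: pick_spanning_subset() is a hypothetical helper and the mean counts are simulated.

```r
# Hypothetical helper (not part of DESeq2): choose `nsub` genes at evenly
# spaced ranks of mean normalized count, so the subset spans the full range.
pick_spanning_subset <- function(base_mean, nsub) {
  stopifnot(nsub >= 1, nsub <= length(base_mean))
  o <- order(base_mean)  # gene indices, from lowest to highest mean
  pos <- round(seq(from = 1, to = length(o), length.out = nsub))
  o[pos]                 # indices into the original gene order
}

# Simulated mean normalized counts for 5000 genes (made-up values)
set.seed(1)
base_mean <- rexp(5000, rate = 1 / 100)

idx <- pick_spanning_subset(base_mean, nsub = 1000)

# The subset reaches from the lowest to the highest mean in the data
range(base_mean[idx])
range(base_mean)
```

Because the selection depends only on the ranking, running it twice on the same data gives the same genes, which is what "deterministic" means here.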

The key parameter in question is:

vst(..., nsub = 1000)
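To see the two functions side by side, here is a minimal sketch on simulated data. It assumes DESeq2 is installed; the dataset comes from makeExampleDESeqDataSet(), and the sizes (5000 genes, 6 samples) are arbitrary choices for illustration.

```r
library(DESeq2)

set.seed(1)
# Simulated count data: 5000 genes, 6 samples (arbitrary sizes)
dds <- makeExampleDESeqDataSet(n = 5000, m = 6)

# Full VST: the dispersion trend is estimated from all genes
vsd_full <- varianceStabilizingTransformation(dds, blind = TRUE)

# Fast VST: the dispersion trend is estimated from nsub genes only
vsd_fast <- vst(dds, blind = TRUE, nsub = 1000)

# Both results hold one transformed value per gene and sample
dim(assay(vsd_full))
dim(assay(vsd_fast))

# The values should be close, but they need not be identical
summary(as.vector(assay(vsd_full) - assay(vsd_fast)))
```

Note that vst() stops with an error if the dataset has fewer than nsub rows, in which case the message recommends calling varianceStabilizingTransformation() directly.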


There is also a difference relating to the usage of blind. The documentation for vst() states:


This is a wrapper for the varianceStabilizingTransformation (VST) that provides much faster estimation of the dispersion trend used to determine the formula for the VST. The speed-up is accomplished by subsetting to a smaller number of genes in order to estimate this dispersion trend. The subset of genes is chosen deterministically, to span the range of genes' mean normalized count. This wrapper for the VST is not blind to the experimental design: the sample covariate information is used to estimate the global trend of genes' dispersion values over the genes' mean normalized count. It can be made strictly blind to experimental design by first assigning a design of ~1 before running this function, or by avoiding subsetting and using varianceStabilizingTransformation.

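However, if you set blind = TRUE for vst() (the default), it sets the design to ~ 1 for you and then calls varianceStabilizingTransformation() with blind = FALSE, as this excerpt from the function body shows:

function (object, blind = TRUE, nsub = 1000, fitType = "parametric") {
    # [...]
        if (blind) {
            design(object) <- ~1
        }
        matrixIn <- FALSE
    # [...]
    vsd <- varianceStabilizingTransformation(object, blind = FALSE)
    # [...]
}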
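The behaviour in that excerpt can be checked directly. This is a minimal sketch assuming DESeq2 is installed and using simulated data from makeExampleDESeqDataSet(), whose default design is ~ condition; the object names are only for illustration.

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 5000, m = 6)  # design is ~ condition

# Let vst() reset the design internally
vsd_blind <- vst(dds, blind = TRUE)

# Reset the design by hand on a copy, then tell vst() not to touch it
dds_intercept <- dds
design(dds_intercept) <- ~ 1
vsd_manual <- vst(dds_intercept, blind = FALSE)

# Expected to agree if blind = TRUE only resets the design
all.equal(assay(vsd_blind), assay(vsd_manual))

# For comparison: keep the ~ condition design when estimating the trend
vsd_design <- vst(dds, blind = FALSE)
max(abs(assay(vsd_blind) - assay(vsd_design)))
```

The last line shows how much the transformed values move when the sample information is used to estimate the dispersion trend.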


The documentation for varianceStabilizingTransformation() describes the transformation itself as follows:

This function calculates a variance stabilizing transformation (VST) from the fitted dispersion-mean relation(s) and then transforms the count data (normalized by division by the size factors or normalization factors), yielding a matrix of values which are now approximately homoskedastic (having constant variance along the range of mean values). The transformation also normalizes with respect to library size. The rlog is less sensitive to size factors, which can be an issue when size factors vary widely. These transformations are useful when checking for outliers or as input for machine learning techniques such as clustering or linear discriminant analysis.



Thank you very much, it is clearer now. I'm still not very comfortable with bioinformatics, so some concepts are a bit difficult for me to understand, even when I study the vignettes and documentation.

Reply written 12 weeks ago by lenC_biotecLover