Hi, I have a question about some of the R syntax present in the DeSeq2 vignette. The following comes directly from the section regarding the removal of batch effects:
mat <- assay(vsd)
mm <- model.matrix(~condition, colData(vsd))
mat <- limma::removeBatchEffect(mat, batch=vsd$batch, design=mm)
assay(vsd) <- mat
plotPCA(vsd)
What does line 4 of this code do? More broadly, it seems like this syntax is "assigning" the object mat to the function assay(vsd), and I am having trouble making sense of that. Does this language replace the entire function of assay() with the limma function listed above?
Unfortunately, neither the vignette nor google seems to have an answer for me so far. Apologies if this is a basic question, and thanks so much for any clarification!
If you run the vst() function on an object of class DESeqDataSet you get the results wrapped in a class called DESeqTransform, which basically a SummarizedExperiment (SE). A SE can store one or many count matrices which are called assays. To access an assay you can do (assuming your SE is called se):
assay(se, assayname)
If you only use assay(se) then the first assay is returned. Use assayNames(se) to list all available assays.
So the code does the following:
#/ this extracts the vsd assay (so the vst-corrected counts) from the vsd which is the DESeqTransform
#/ which is the first (and only) assay by default
mat <- assay(vsd)
#/ limma stuff...
mm <- model.matrix(~condition, colData(vsd))
mat <- limma::removeBatchEffect(mat, batch=vsd$batch, design=mm)
#/ this feeds the batch-corrected counts (=mat) back into the DESeqTransform object called vsd
assay(vsd) <- mat
The reason the last line of code above is done is that the plotPCA function works only with a DESeqTransform object, hence the mat is put back into the vsd object which is a DESeqTransform. This is not "assigning" mat to a function. Instead the assay command is a so-called setter function which is meant to put an object (here mat) into the correct location of the vsd. The assay serves both as a setter and also as a getter (so retrieval) function. This is a principle to conveniently put data into and extract from certain classes rather than having the user find these slots with access operators such as object$(...) or object@(...).
Thank you so much, this was a brilliantly clear explanation! The missing piece of information for me was the fact that multiple assays could be stored under a single DeSeqTransform or SE object, so I understand now that it is just a more specific version of object assignment. Thank you very much for explaining this!
Hi ATpoint,
Thank you so much, this was a brilliantly clear explanation! The missing piece of information for me was the fact that multiple assays could be stored under a single DeSeqTransform or SE object, so I understand now that it is just a more specific version of object assignment. Thank you very much for explaining this!
-Jared