Question: Remove Surrogate Variables and Batch
4.9 years ago by
tucanj80
Canada
tucanj80 wrote:

Is there a way to output a gene expression matrix corrected for both batch (I know I can use ComBat for that) and the surrogate variables from SVA? I would like to visualize the results of a limma differential expression analysis, but the data are confounded by these factors and so are not amenable to visualization as they stand.

combat sva R • 5.2k views
ADD COMMENTlink modified 3.0 years ago by ddiez1.8k • written 4.9 years ago by tucanj80
4.9 years ago by
brentp23k
Salt Lake City, UT
brentp23k wrote:

If you have:

  • 'y' as the gene expression matrix (genes in rows, samples in columns)
  • 'mod' as the model matrix you sent to sva (the full model)
  • 'svs' as svobj$sv where svobj is the output from the sva function

then you can use the function below to get a "cleaned" version of the matrix with the surrogate variables removed.

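cleanY <- function(y, mod, svs) {
    # y: genes x samples; mod: samples x P model matrix; svs: samples x K surrogate variables
    X <- cbind(mod, svs)
    # Least-squares coefficients for every gene, solved without forming an explicit inverse
    beta <- solve(crossprod(X), t(X) %*% t(y))
    # Columns of X (and rows of beta) that belong to the surrogate variables
    sv_cols <- seq_len(ncol(X))[-seq_len(ncol(mod))]
    # Subtract only the fitted surrogate-variable component; drop = FALSE keeps
    # the matrix shape when there is a single surrogate variable
    y - t(X[, sv_cols, drop = FALSE] %*% beta[sv_cols, , drop = FALSE])
}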

 

I modified this from a post by Andrew Jaffe here:

http://permalink.gmane.org/gmane.science.biology.informatics.conductor/42857
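
As a quick sanity check, here is a minimal sketch on simulated data, using base R only. The simulated 'svs' matrix is a hypothetical stand-in for svobj$sv, the group labels and effect sizes are made up for illustration, and cleanY is restated compactly so the snippet runs on its own.

```r
# Compact restatement of cleanY (genes in rows, samples in columns)
cleanY <- function(y, mod, svs) {
    X <- cbind(mod, svs)
    beta <- solve(t(X) %*% X) %*% t(X) %*% t(y)
    P <- ncol(mod)
    y - t(as.matrix(X[, -c(1:P)]) %*% beta[-c(1:P), ])
}

set.seed(1)
n_genes <- 200
n_samples <- 12
group <- factor(rep(c("ctl", "trt"), each = 6))
mod <- model.matrix(~ group)

# ASSUMPTION: two simulated hidden factors standing in for svobj$sv
svs <- matrix(rnorm(n_samples * 2), ncol = 2)

# Expression = noise + hidden-factor effects + a treatment effect in the first 20 genes
y <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
y <- y + matrix(rnorm(n_genes * 2), ncol = 2) %*% t(svs)
y[1:20, group == "trt"] <- y[1:20, group == "trt"] + 2

y_clean <- cleanY(y, mod, svs)

# Refit the full model [mod, svs] before and after cleaning
coef_before <- lm.fit(cbind(mod, svs), t(y))$coefficients
coef_after  <- lm.fit(cbind(mod, svs), t(y_clean))$coefficients

max(abs(coef_after[3:4, ]))                      # surrogate-variable coefficients: numerically zero
max(abs(coef_after[1:2, ] - coef_before[1:2, ])) # intercept and group coefficients: unchanged
```

The cleaned matrix keeps the intercept and group effects estimated by the full model and drops only the fitted surrogate-variable component, which is what makes it suitable for plots such as heatmaps or PCA.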

ADD COMMENTlink written 4.9 years ago by brentp23k

The joys of linear algebra :)

ADD REPLYlink written 4.9 years ago by Devon Ryan92k
3.0 years ago by
jtleek40
jtleek40 wrote:

Performing differential expression analysis on cleaned data is not recommended: the residual degrees of freedom will be wrong, because the downstream model does not account for the surrogate variables that were already fitted and removed. The best approach is to include the svs as adjustment covariates in the downstream linear models (limma etc.).

  • Jeff (co-developer of sva)
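
A minimal sketch of that recommendation, assuming limma is installed. The data are simulated, and 'svs' is again a hypothetical stand-in for svobj$sv; in a real analysis it would come from the sva call made with the same full model.

```r
library(limma)

set.seed(1)
n_genes <- 200
n_samples <- 12
group <- factor(rep(c("ctl", "trt"), each = 6))

# ASSUMPTION: simulated hidden factors standing in for svobj$sv
svs <- matrix(rnorm(n_samples * 2), ncol = 2,
              dimnames = list(NULL, c("sv1", "sv2")))

# Simulated expression matrix (genes x samples) with hidden-factor and treatment effects
y <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
y <- y + matrix(rnorm(n_genes * 2), ncol = 2) %*% t(svs)
y[1:20, group == "trt"] <- y[1:20, group == "trt"] + 2
rownames(y) <- paste0("gene", seq_len(n_genes))

# Full design: the model of interest plus the surrogate variables as covariates
mod <- model.matrix(~ group)
design <- cbind(mod, svs)

# Fit on the ORIGINAL (uncleaned) data so the residual df reflect the svs
fit <- eBayes(lmFit(y, design))
top <- topTable(fit, coef = "grouptrt", number = 10)

# Residual df per gene: 12 samples minus 4 design columns
fit$df.residual[1]
```

Because the surrogate variables sit in the design matrix, each one costs a degree of freedom in the test for the group effect, which is exactly the accounting that is lost when the test is run on a pre-cleaned matrix.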
ADD COMMENTlink written 3.0 years ago by jtleek40

Dear Jeff, is it correct to perform a gene regulatory network (GRN) analysis on data cleaned with the function mentioned above?

ADD REPLYlink written 2.2 years ago by lessismore710
3.0 years ago by
ddiez1.8k
Japan
ddiez1.8k wrote:

With Jeff's answer in mind, if you just want to remove the effect of batch variables for visualisation (e.g. heatmap, PCA), the limma function removeBatchEffect() can be used for that purpose.
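
A minimal sketch of that call, assuming limma is installed. The batch labels, group labels and the 'svs' matrix (a hypothetical stand-in for svobj$sv) are simulated for illustration. Passing the treatment design through the design argument tells removeBatchEffect() which effects to preserve.

```r
library(limma)

set.seed(1)
n_genes <- 200
n_samples <- 12
group <- factor(rep(c("ctl", "trt"), each = 6))
batch <- factor(rep(c("b1", "b2"), times = 6))

# ASSUMPTION: simulated hidden factors standing in for svobj$sv
svs <- matrix(rnorm(n_samples * 2), ncol = 2)

# Simulated expression matrix with batch, hidden-factor and treatment effects
y <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
y[, batch == "b2"] <- y[, batch == "b2"] + 1.5
y <- y + matrix(rnorm(n_genes * 2), ncol = 2) %*% t(svs)
y[1:20, group == "trt"] <- y[1:20, group == "trt"] + 2

# Treatment design only: these effects are kept, batch and svs are removed
design <- model.matrix(~ group)
y_vis <- removeBatchEffect(y, batch = batch, covariates = svs, design = design)

# For visualisation only, e.g. a PCA of the samples coloured by group
pca <- prcomp(t(y_vis))
plot(pca$x[, 1:2], col = as.integer(group), pch = 19)
```

The corrected matrix is meant for plots; for testing, the original data with batch and the svs in the design matrix remain the appropriate input.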

ADD COMMENTlink written 3.0 years ago by ddiez1.8k

Hello, could you be clearer? Why just for visualization? Are you saying that the limma-corrected dataset cannot be used for other purposes?

ADD REPLYlink written 2.2 years ago by lessismore710

Take a look at ?limma::removeBatchEffect for details of other uses of the processed dataset. For differential expression it is better to use the original dataset and add batch as a covariate, as indicated in @jtleek's answer.

ADD REPLYlink written 2.2 years ago by ddiez1.8k