Question: sva + edgeR - differential expression analysis - RNA-seq data
mrodrigues.fernanda (United States / Saint Louis / Washington University in Saint Louis) wrote 5.2 years ago:

Dear list,
I am performing an RNA-seq analysis for differential gene expression and I have a question regarding the use of the package sva for the estimation of unknown batch effects.

The sva vignette shows examples of using the package to estimate surrogate variables and then performing the DE analysis with the package limma (I am referring to Section 6 of the sva vignette: "Adjusting for surrogate variables using the limma package").

Is it possible to do the same using the package edgeR instead of limma?

Or is sva not compatible with edgeR?

Sorry if this is a dumb question. I am a little new to the bioinformatics world.


Thank you!!!


For RNA-seq data, you should use the svaseq() function instead of sva(). That's true whether you're using limma-voom, edgeR or DESeq. The sva author also recommends scaling and normalizing the counts before running SVA.

Basic example, assuming counts is your count matrix and clin is a data frame of sample (clinical) annotations with a Condition column:

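library(edgeR)
library(sva)

y <- DGEList(counts)
y <- calcNormFactors(y)                       # normalization factors for library sizes
mod <- model.matrix(~ Condition, data=clin)   # full model
mod0 <- model.matrix(~ 1, data=clin)          # null model (intercept only)
svobj <- svaseq(cpm(y), mod, mod0)            # surrogate variables from normalized counts
des <- cbind(mod, svobj$sv)                   # append surrogate variables to the design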

You can now proceed with des as your design matrix.
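As a rough sketch of what proceeding with des could look like in edgeR (this is not part of the original answer): dispersion estimation and model fitting take the DGEList of raw counts together with the augmented design, while cpm(y) is only used as input to svaseq(). The toy counts, the hidden batch, n.sv = 1 and the SV column names are assumptions made so the example runs on its own; swap in your own counts and clin, and drop n.sv if you want svaseq() to estimate the number of surrogate variables.

```r
library(edgeR)
library(sva)

set.seed(1)
# Toy data (assumption): 1000 genes, 12 samples, two conditions,
# plus a hidden batch that shifts the first 300 genes.
clin <- data.frame(Condition = factor(rep(c("A", "B"), each = 6)))
hidden <- rep(c(0, 1), times = 6)
mu <- matrix(50, nrow = 1000, ncol = 12)
mu[1:300, hidden == 1] <- 150
counts <- matrix(rnbinom(length(mu), mu = mu, size = 5), nrow = 1000)

y <- DGEList(counts)
y <- calcNormFactors(y)
mod <- model.matrix(~ Condition, data = clin)
mod0 <- model.matrix(~ 1, data = clin)

# n.sv = 1 is fixed here only to keep the toy example predictable.
svobj <- svaseq(cpm(y), mod, mod0, n.sv = 1)
sv <- as.matrix(svobj$sv)
colnames(sv) <- paste0("SV", seq_len(ncol(sv)))
des <- cbind(mod, sv)

# The DGEList holding the raw counts goes into estimateDisp(), not cpm(y).
y <- estimateDisp(y, design = des)
fit <- glmQLFit(y, design = des)

# coef = 2 is the Condition coefficient (second column of des).
res <- glmQLFTest(fit, coef = 2)
topTags(res)
```

Testing coef = 2 assumes Condition has exactly two levels; with more levels, pick the coefficient or contrast that matches your comparison.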

written 4.5 years ago by d.watson

Hi Watson, thanks for the reply. I have a question about the next steps. Once we have des as the design matrix, the call would be disp = estimateDisp(???, design). What should we pass as the data argument to estimateDisp?

written 3.5 years ago by pwwang
h.mon wrote 5.2 years ago:

In my (shallow) understanding, no: the sva manual suggests a log(g[ij] + c) transformation, whereas edgeR models read counts with the negative binomial distribution and specifically states that only read counts should be used. You may use sva + voom + limma, or include the batch effects in your GLM and proceed with edgeR.
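The second option, putting a known batch into the GLM, can be sketched roughly as follows. The Batch column and the toy counts are hypothetical and not from the thread; this only applies when the batch labels are actually recorded.

```r
library(edgeR)

set.seed(1)
# Toy data (assumption): 500 genes, 12 samples, known batch labels.
clin <- data.frame(
  Condition = factor(rep(c("A", "B"), each = 6)),
  Batch     = factor(rep(c("b1", "b2"), times = 6))
)
counts <- matrix(rnbinom(500 * 12, mu = 50, size = 5), nrow = 500)

y <- DGEList(counts)
y <- calcNormFactors(y)

# Batch enters the design as an additive term next to Condition.
design <- model.matrix(~ Batch + Condition, data = clin)

y <- estimateDisp(y, design = design)
fit <- glmQLFit(y, design = design)

# Condition is the last column of this design.
res <- glmQLFTest(fit, coef = ncol(design))
topTags(res)
```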

Devon Ryan (Freiburg, Germany) wrote 5.2 years ago:

You end up just adding columns to your model matrix in edgeR. Here's a similar discussion about SVA and DESeq2: "Batch effect in DESeq2 - multiple factor or SVA?"

The same principles apply.
