Question: DESeq and Limma+Voom Normalization for Rna-Seq Data Using Ercc Spike-In
0
gravatar for komal.rathi
3.3 years ago by
komal.rathi3.4k
Children's Hospital of Philadelphia, Philadelphia, PA
komal.rathi3.4k wrote:

Hello everyone,

I have some RNASeq data, that has been spiked-in with exogenous ERCC controls. I have the counts from htseq-count and now I want to normalize + test for differential expression. For this, I am using two methods: DESeq and limma+voom. Below is my code for both the tests. Can someone tell me if what I am doing is correct or not? I do think that I am slightly mistaken in the voom code. 

##### Using DESeq
library(DESeq)
# get the size factors from the spiked-ins
# dat.spike.in is count data of only spike ins
cds <- newCountDataSet(countData = dat.spike.in, conditions = as.character(condition))
cds <- estimateSizeFactors(cds)
nf <- sizeFactors(cds) 

# add the size factors to data without spike-ins
# dat.without.spike.in is count data where spike-ins are removed
cds <- newCountDataSet(countData = dat.without.spike.in, conditions = as.character(condition))
sizeFactors(cds) <- nf 

# differential expression test
cds <- estimateDispersions( cds, method= 'pooled', sharingMode='maximum', fitType='local') 
deseq.res <- nbinomTest( cds, "A", "B")

##### Using Limma + Voom
library(limma)
library(edgeR)
groups = as.factor(condition) # conditions 
design <- model.matrix(~0+groups)
colnames(design) = levels(groups)
nf <- calcNormFactors(dat.spike.in) # calculation norm factors using spike ins only
voom.norm <- voom(dat.without.spike.in, design, lib.size = colSums(dat.spike.in) * nf) # add that to voom

Awaiting your input and suggestions!

Thanks!

voom rna-seq deseq limma spike-in • 2.6k views
ADD COMMENTlink modified 3.3 years ago by Gordon Smyth680 • written 3.3 years ago by komal.rathi3.4k
1

I can at least say that the DESeq code is correct (though you should be using DESeq2 instead). It's been long enough since I've used limma/voom that I'll wait for someone else to comment.

ADD REPLYlink written 3.3 years ago by Devon Ryan88k

Thank you, I still don't know if this is the best way i.e. normalizing with ERCC spike-ins. But atleast this gives me some starting point. 

ADD REPLYlink written 3.3 years ago by komal.rathi3.4k

In general, I would only normalize against the spike-ins if absolutely needed (there are definite cases where this is needed). They're generally less robust.

ADD REPLYlink written 3.3 years ago by Devon Ryan88k
1

I am using Limma and Voom and I think I also have similar workflow like yours.

ADD REPLYlink written 3.3 years ago by bharata1803420

Thank you! Do you know of any reference papers that have used the same? 

ADD REPLYlink written 3.3 years ago by komal.rathi3.4k

Why calcNormFactors only use dat.spike.in ?

ADD REPLYlink written 3.3 years ago by Zhilong Jia1.4k

Because the ERCC spike-ins are being used for normalization. This is needed in cases like single-cell sequencing or whenever else there might be transcriptional amplification.

ADD REPLYlink written 3.3 years ago by Devon Ryan88k
3
gravatar for Gordon Smyth
3.3 years ago by
Gordon Smyth680
Australia
Gordon Smyth680 wrote:

https://support.bioconductor.org/p/74870/

ADD COMMENTlink written 3.3 years ago by Gordon Smyth680
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 885 users visited in the last hour