Question: DESeq and Limma+Voom Normalization for Rna-Seq Data Using Ercc Spike-In
0
5.2 years ago by
komal.rathi3.7k
Children's Hospital of Philadelphia, Philadelphia, PA
komal.rathi3.7k wrote:

Hello everyone,

I have some RNASeq data, that has been spiked-in with exogenous ERCC controls. I have the counts from htseq-count and now I want to normalize + test for differential expression. For this, I am using two methods: DESeq and limma+voom. Below is my code for both the tests. Can someone tell me if what I am doing is correct or not? I do think that I am slightly mistaken in the voom code.

```##### Using DESeq
library(DESeq)
# get the size factors from the spiked-ins
# dat.spike.in is count data of only spike ins
cds <- newCountDataSet(countData = dat.spike.in, conditions = as.character(condition))
cds <- estimateSizeFactors(cds)
nf <- sizeFactors(cds)

# add the size factors to data without spike-ins
# dat.without.spike.in is count data where spike-ins are removed
cds <- newCountDataSet(countData = dat.without.spike.in, conditions = as.character(condition))
sizeFactors(cds) <- nf

# differential expression test
cds <- estimateDispersions( cds, method= 'pooled', sharingMode='maximum', fitType='local')
deseq.res <- nbinomTest( cds, "A", "B")

##### Using Limma + Voom
library(limma)
library(edgeR)
groups = as.factor(condition) # conditions
design <- model.matrix(~0+groups)
colnames(design) = levels(groups)
nf <- calcNormFactors(dat.spike.in) # calculation norm factors using spike ins only
voom.norm <- voom(dat.without.spike.in, design, lib.size = colSums(dat.spike.in) * nf) # add that to voom```

Awaiting your input and suggestions!

Thanks!

voom rna-seq deseq limma spike-in • 4.0k views
ADD COMMENTlink
modified 5.2 years ago by Gordon Smyth2.1k • written 5.2 years ago by komal.rathi3.7k
1

I can at least say that the DESeq code is correct (though you should be using DESeq2 instead). It's been long enough since I've used limma/voom that I'll wait for someone else to comment.

ADD REPLYlink written 5.2 years ago by Devon Ryan98k

Thank you, I still don't know if this is the best way i.e. normalizing with ERCC spike-ins. But at least this gives me some starting point.

ADD REPLYlink modified 14 months ago by _r_am32k • written 5.2 years ago by komal.rathi3.7k

In general, I would only normalize against the spike-ins if absolutely needed (there are definite cases where this is needed). They're generally less robust.

ADD REPLYlink written 5.2 years ago by Devon Ryan98k
1

I am using Limma and Voom and I think I also have similar workflow like yours.

ADD REPLYlink written 5.2 years ago by bharata1803490

Thank you! Do you know of any reference papers that have used the same?

ADD REPLYlink modified 14 months ago by _r_am32k • written 5.2 years ago by komal.rathi3.7k

Why calcNormFactors only use `dat.spike.in`?

ADD REPLYlink modified 14 months ago by _r_am32k • written 5.2 years ago by Zhilong Jia1.6k

Because the ERCC spike-ins are being used for normalization. This is needed in cases like single-cell sequencing or whenever else there might be transcriptional amplification.

ADD REPLYlink written 5.2 years ago by Devon Ryan98k
3
5.2 years ago by
Gordon Smyth2.1k
Australia
Gordon Smyth2.1k wrote:
ADD COMMENTlink written 5.2 years ago by Gordon Smyth2.1k
Please log in to add an answer.

Content
Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2072 users visited in the last hour
_