Question

ERCC normalization within DESeq2

12 months ago

Assa Yeroslaviz ★ 1.8k

Hi,

I have a bulk RNA-seq data set with ERCC spike-ins and would like to make sure I understand how to normalize with them.

After adding the ERCC sequences to the reference genome in question (both the FASTA and the GTF), I can map the FASTQ files as usual. I can then quantify the resulting BAM files with e.g. featureCounts to get my raw count table.

To get the size factors from the ERCCs, do I need to subset the count table to only the ERCC "genes" and then calculate the size factors on that subset alone?

Or can I use the controlGenes argument directly:

dds <- estimateSizeFactors(dds, controlGenes= <names or numeric index of my ERCC features> )
dds <- DESeq(dds, ...)

thanks

normalization ERCC size.factors DESeq2

score 3 · Accepted Answer · 2023-05-02

12 months ago

Carlo Yague 8.7k

Both methods should give the same size factors, but estimateSizeFactors(dds, controlGenes=... ) is arguably more convenient.
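To check the equivalence yourself, here is a minimal sketch on simulated data. It assumes DESeq2 is installed; the data come from makeExampleDESeqDataSet(), and the first 50 features are renamed to made-up ERCC-style IDs purely for illustration, so they are not real spike-ins. As far as I know the documentation describes controlGenes as a numeric or logical index, so a logical vector is used here.

```r
library(DESeq2)

set.seed(42)
# Simulated counts stand in for a real featureCounts table.
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Hypothetical: pretend the first 50 features are ERCC spike-ins.
rownames(dds)[1:50] <- sprintf("ERCC-%05d", 1:50)
is_ercc <- grepl("^ERCC-", rownames(dds))

# Approach 1: let estimateSizeFactors() use only the control features.
dds_a <- estimateSizeFactors(dds, controlGenes = is_ercc)

# Approach 2: subset to the ERCC rows, estimate size factors there,
# then copy them back onto the full object.
dds_ercc <- estimateSizeFactors(dds[is_ercc, ])
dds_b <- dds
sizeFactors(dds_b) <- sizeFactors(dds_ercc)

# Compare the two sets of size factors.
print(all.equal(sizeFactors(dds_a), sizeFactors(dds_b)))

# DESeq() keeps size factors that are already stored in the object.
dds_a <- DESeq(dds_a)
```

With real data, only the construction of dds and the pattern used to find the ERCC rows should need to change.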

They are the same, see https://github.com/mikelove/DESeq2/blob/devel/R/core.R#L559-L577

controlGenes internally subsets the dds object to the control genes and derives the size factors from that subset. Side note: you can do the same in edgeR by calculating the TMM factors on any subset of genes and then putting them back into the main DGEList. You just have to do it manually, whereas DESeq2 has the convenience argument controlGenes.
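A rough sketch of that manual edgeR route, assuming edgeR is installed. The count matrix is simulated and the ERCC-style row names are made up for illustration. One assumption worth flagging: the subset is taken with keep.lib.sizes = TRUE so that the TMM factors are computed against the full library sizes they will later be paired with.

```r
library(edgeR)

set.seed(1)
# Simulated counts; the first 50 rows carry hypothetical ERCC-style names.
counts <- matrix(
  rnbinom(1000 * 6, mu = 100, size = 10),
  nrow = 1000,
  dimnames = list(
    c(sprintf("ERCC-%05d", 1:50), paste0("gene", 1:950)),
    paste0("sample", 1:6)
  )
)
y <- DGEList(counts = counts)

# Subset to the controls but keep the library sizes of the full data set.
is_ercc <- grepl("^ERCC-", rownames(y))
y_ctrl <- y[is_ercc, , keep.lib.sizes = TRUE]

# TMM factors from the control features only.
y_ctrl <- calcNormFactors(y_ctrl, method = "TMM")

# Put the factors back into the main DGEList.
y$samples$norm.factors <- y_ctrl$samples$norm.factors
print(y$samples)
```

Downstream steps then pick up the norm.factors stored in y$samples as usual.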

ADD REPLY • link 12 months ago by ATpoint 82k