Question: Normalization scaling factors: formula for applying them to raw counts
8 months ago, sovrappensiero wrote:

Hello,

I am using both edgeR and DESeq2 to normalize raw counts (the data are amplicon sequencing data, but neither RNA-seq nor 16S amplicon sequencing). I just need to normalize them before creating a visualization. It's preliminary work, so the parts of these packages that calculate differential expression are not useful to me.

I have two sets of scaling factors (from edgeR using the TMM and RLE methods). My question is what is the correct approach for applying these scaling factors to my raw counts. Is it:

raw count / scaling factor

or

raw count / (library size * scaling factor)

I've been researching these methods and so far I have seen it both ways. I'm still not sure how to just get normalization factors from DESeq2, as I just got that package installed yesterday evening. But I've kept the DESeq2 tag because the question applies to both and if anyone has advice regarding DESeq2 that could be helpful to me and others.

Rookie question: the dispersion calculation would make sense for evaluating DE, not as part of the normalization, right?

Thanks for the help.

Tags: R, edgeR, DESeq2, normalization
8 months ago, Kevin Blighe (Republic of Ireland) wrote:

To normalise, you simply divide the raw counts by the size factor (assuming that you have arrived at your size factors in the correct way). A good worked example is given here: Normalization
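As a minimal sketch of the two formulas from the question (plain Python rather than R, purely to show the arithmetic; the counts and factors are made-up numbers and the function names are hypothetical): DESeq2 size factors are divided directly into the raw counts, which is what counts(dds, normalized=TRUE) returns. The normalization factors from edgeR's calcNormFactors, as far as I am aware, are meant to rescale the library size instead, so the second formula applies to them, and cpm() reports that value per million.

```python
# Made-up raw counts: rows are features, columns are samples.
raw_counts = [
    [100, 200],
    [50, 100],
    [10, 20],
]


def divide_by_size_factor(counts, size_factors):
    """DESeq2-style: raw count / size factor, one factor per sample (column)."""
    return [[c / sf for c, sf in zip(row, size_factors)] for row in counts]


def counts_per_million(counts, norm_factors):
    """edgeR-style: raw count / (library size * normalization factor) * 1e6."""
    # Library size is the column sum of the raw counts.
    lib_sizes = [sum(col) for col in zip(*counts)]
    # Effective library size = library size * normalization factor.
    effective = [ls * nf for ls, nf in zip(lib_sizes, norm_factors)]
    return [[c / e * 1e6 for c, e in zip(row, effective)] for row in counts]


size_factors = [0.5, 1.0]   # hypothetical DESeq2-style size factors
norm_factors = [1.0, 1.0]   # hypothetical edgeR-style normalization factors

normalized = divide_by_size_factor(raw_counts, size_factors)
cpm = counts_per_million(raw_counts, norm_factors)
```

The two kinds of factor live on different scales: a DESeq2 size factor already absorbs sequencing depth, while an edgeR normalization factor is a correction close to 1 that only makes sense once multiplied by the library size.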

To obtain the DESeq2 normalisation (size) factors in the first place, you could first normalise the data in DESeq2 (for example with estimateSizeFactors(dds)) and then retrieve them with: sizeFactors(dds). This is stated in the vignette: Analyzing RNA-seq data with DESeq2

For dispersion, take a look at my answer here: A: Clarification on how DSEeq2 Dispersion Curve is Generated. I am almost certain that dispersion is indeed used for the DE analysis and is not part of the normalisation.


Thank you! That was very helpful.

written 8 months ago by sovrappensiero

@Kevin: Is this method still valid for scale factors generated by upper quartile or scaled median normalization? Are RLE and the median-of-ratios method described in your link the same calculation? And the same question for the median and scaled median methods: are they the same?

written 8 months ago by user3188830

I cannot say that each normalisation method just involves a division by a particular size factor - each has a different formula that may or may not involve a 'size factor'.

From what I understand, the median ratios method is an extension of RLE, and is currently the method used by DESeq2, as per the link that I gave. For 100% clarification, I would suggest re-posting your question on the Bioconductor forum, where the DESeq2 developers are more likely to respond.

A good practice would be to calculate the size factors manually and then via DESeq2, and then you'll have empirical evidence of how exactly it works.
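A manual calculation along those lines could be sketched as follows. This is plain Python with made-up counts, not DESeq2's own code, and the function name is hypothetical; it follows the median-of-ratios idea as I understand it (per-feature geometric mean across samples as a reference, then the median ratio to that reference within each sample), and it assumes that features with a zero count in any sample are left out of the reference.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Size factors from a counts table (rows = features, columns = samples)."""
    n_samples = len(counts[0])
    # Keep only features with non-zero counts in every sample, so that the
    # log geometric mean is finite (assumption stated in the lead-in).
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no feature has non-zero counts in all samples")
    # Log of the per-feature geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each count to its feature's geometric mean.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Made-up example: sample 2 is sequenced twice as deeply as sample 1.
example_counts = [
    [100, 200],
    [50, 100],
    [10, 20],
    [0, 5],   # has a zero, so it is excluded from the reference
]
size_factors = median_of_ratios_size_factors(example_counts)
```

Comparing the output of a sketch like this with sizeFactors(dds) on the same matrix is one way to get the empirical evidence mentioned above.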

written 8 months ago by Kevin Blighe