Using ColSums vs sizeFactors in read count normalization
4.4 years ago
tpaboh • 0

Hello,

I have an RNA-seq data set with 10 samples. I noticed that I get slightly different FPKM values depending on whether I use colSums or sizeFactors for read count normalization. (See the following figure for the colSums vs sizeFactors distribution.)

My question is how to decide which library size measure to use for normalization. Does it come down to personal preference?

Really appreciate your comments on this.

Thank you!

RNA-Seq R DESeq2 fpkm • 2.1k views

Thank you all very much for the material. This is very helpful!

4.4 years ago

There are already many questions addressing this issue. Search for the terms "median of ratios method" (the normalization used to calculate the size factors in DESeq, also called RLE) or "between-sample normalization".

In short, the median of ratios is a more robust normalization metric. In contrast, metrics based on total read counts (colSums, as you said) are very sensitive to highly expressed genes, which can skew the normalization for all the other genes. For instance, you can read this review, which nicely illustrates the issue.
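To make the difference concrete, here is a minimal pure-Python sketch of both ideas on made-up counts. It is not DESeq2's code; the function names and the toy matrix are assumptions for illustration only. Four genes are identical between the two samples and one highly expressed gene differs tenfold.

```python
import math
from statistics import median


def total_count_factors(counts):
    """Per-sample factor from column sums, scaled by the mean column sum."""
    n_samples = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_samples)]
    mean_total = sum(totals) / n_samples
    return [t / mean_total for t in totals]


def median_of_ratios_factors(counts):
    """Per-sample median (over genes) of count / geometric mean across samples."""
    n_samples = len(counts[0])
    # Genes with a zero in any sample have no usable geometric mean; skip them.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy data: rows are genes, columns are two samples.
counts = [
    [100, 100],
    [200, 200],
    [50, 50],
    [300, 300],
    [1000, 10000],  # a single highly expressed gene that differs between samples
]

tc = total_count_factors(counts)
mr = median_of_ratios_factors(counts)
print("total-count factors:     ", tc)
print("median-of-ratios factors:", mr)
```

In this toy case the total-count factors differ more than sixfold between the samples because of the one dominant gene, while the median-of-ratios factors are equal, since most genes are unchanged.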

4.4 years ago

For finding DE genes, you should follow the regular DESeq2 protocol and not use FPKM.

For visualization purposes, 'correct' normalization is a little less important. Learn how each normalization works, and decide which method's assumptions better fit your data.
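As a rough sketch of how the library-size choice feeds into FPKM, the snippet below computes FPKM for one gene under two library-size definitions. All numbers are hypothetical, and rescaling the size factors by the geometric mean of the column sums is an assumed convention here, not something stated in this thread.

```python
import math


def fpkm(count, gene_length_bp, library_size):
    """Fragments per kilobase of gene per million fragments in the library."""
    return count * 1e9 / (gene_length_bp * library_size)


# Hypothetical values for two samples.
col_sums = [1650, 10650]    # total counts per sample
size_factors = [1.0, 1.0]   # assumed median-of-ratios size factors

# Assumed convention: put size factors on the scale of a library size
# by multiplying with the geometric mean of the column sums.
geo_mean_depth = math.exp(sum(math.log(s) for s in col_sums) / len(col_sums))
robust_sizes = [sf * geo_mean_depth for sf in size_factors]

gene_counts = [100, 100]    # same raw count for this gene in both samples
gene_length_bp = 2000

fpkm_colsums = [fpkm(c, gene_length_bp, n)
                for c, n in zip(gene_counts, col_sums)]
fpkm_robust = [fpkm(c, gene_length_bp, n)
               for c, n in zip(gene_counts, robust_sizes)]
print("FPKM with colSums:     ", fpkm_colsums)
print("FPKM with size factors:", fpkm_robust)
```

With colSums as the library size, the same raw count yields very different FPKM values in the two samples; with the size-factor-based library size, the values agree. Which behaviour is appropriate depends on whether the assumption that most genes are unchanged holds for your data.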

4.4 years ago
ATpoint 81k

What you refer to is naive per-million normalization vs RLE from DESeq2. Please read the DESeq paper, which discusses why this technique exists and why it outperforms naive methods. Also, search PubMed for papers benchmarking normalization methods. They will all show that per-million is inferior. For a quick introduction see:
