How to evaluate different RNA-seq normalization methods?
4.1 years ago
dz2353 ▴ 120

Hi there,

Is there a standard strategy for evaluating normalization methods for RNA-seq read counts? I would like to compare the expression level of the same gene between different samples, so I know I do not need to correct for exon length and only need to normalize for sequencing depth. I have tried several methods, including CPM, UP, TMM, DESeq2 and deconvolution methods, but I do not know how to evaluate them. I plotted some summary statistics, such as the coefficient of variation versus the mean, and the variance, but there does not seem to be much difference between the methods. Is there any way to work out which method is best? Thank you in advance for any answers, ideas, or suggestions.
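For reference, the kind of check described above can be sketched as follows. This is only a minimal illustration using the Python standard library: the count values are made up, and the helper names `cpm` and `per_gene_cv` are hypothetical, not taken from any package.

```python
from statistics import mean, stdev

def cpm(counts):
    """Scale one sample's raw counts to counts per million."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]

def per_gene_cv(samples):
    """Return one (mean, CV) pair per gene across samples.

    `samples` is a list of equally long per-sample value lists.
    """
    result = []
    for values in zip(*samples):
        m = mean(values)
        cv = stdev(values) / m if m > 0 else float("nan")
        result.append((m, cv))
    return result

# Toy data (made up): 3 samples x 4 genes; sample 2 is sequenced twice as deep.
raw = [
    [10, 200, 30, 760],
    [20, 400, 60, 1520],
    [12, 190, 33, 765],
]

normalized = [cpm(sample) for sample in raw]
stats = per_gene_cv(normalized)

for gene_index, (m, cv) in enumerate(stats):
    print(f"gene {gene_index}: mean CPM = {m:.1f}, CV = {cv:.3f}")
```

Plotting CV against the mean for each normalization then gives the comparison mentioned above. A lower CV on its own does not prove a method is better, since real biological differences also contribute to it.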

RNA-Seq • 1.3k views

I suggest you read one of the many benchmarking papers that compare these methods, available via PubMed. As usual, there is no "best". TMM and RLE (the one from DESeq2) typically perform comparably and well. Honestly, I would not spend too much time on these comparisons: benchmarking is an art of its own, and you really need a sophisticated setup to extract meaningful information. Check the available papers if you really want to repeat these evaluations. You are better off spending time on interpreting the results than on benchmarking yourself. Both edgeR and DESeq2 are perfectly fine and accepted for RNA-seq. What you should ask yourself is whether your data violate the assumption behind these normalizations, which is that a large number of genes do not change between conditions.
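To make that assumption concrete, here is a deliberately simplified two-sample sketch. It is not the TMM or RLE implementation from edgeR or DESeq2. The helper `median_ratio_factor` is hypothetical and the counts are made up; it only shows how a median-based scaling factor behaves when few versus most genes change.

```python
from statistics import median

def median_ratio_factor(sample, reference):
    """Median of per-gene count ratios sample/reference (genes with zeros skipped)."""
    ratios = [s / r for s, r in zip(sample, reference) if s > 0 and r > 0]
    return median(ratios)

reference = [100, 200, 300, 400, 500]

# Case 1: only one gene changes; the median ratio is unaffected by it.
few_changed = [100, 200, 300, 400, 5000]

# Case 2: four of five genes truly double; the median ratio follows them,
# so dividing by it would hide the real up-regulation.
most_changed = [200, 400, 600, 800, 500]

factor_few = median_ratio_factor(few_changed, reference)
factor_most = median_ratio_factor(most_changed, reference)

print("factor when few genes change: ", factor_few)
print("factor when most genes change:", factor_most)
```

In the second case the scaling factor absorbs the biological change, which is why it is worth asking whether a global shift in expression is plausible in your experiment.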


Thank you for your reply. I agree with you. There is no need to spend too much time on normalization. Regarding the assumption you mentioned, can I understand it as meaning that most genes show similar expression levels between samples?


The median ratio normalization in DESeq2 doesn't rely as strongly on the assumption that "most genes don't change". See Michael Love's post at https://support.bioconductor.org/p/61604/

That said, if most genes do show similar expression levels between samples, the median ratio works well at capturing the size factor differences between samples.
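As a rough illustration of the median-of-ratios idea, here is a minimal standard-library sketch. It follows the commonly described recipe (per-gene geometric mean as a pseudo-reference, then the median ratio per sample) and is not DESeq2's actual code. The function name `size_factors` and the counts are assumptions made for the example.

```python
import math
from statistics import median

def size_factors(samples):
    """Median-of-ratios size factors; `samples` is a list of per-sample count lists."""
    n = len(samples)
    # Log geometric mean per gene; genes with a zero in any sample are skipped.
    log_geo_means = []
    for values in zip(*samples):
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / n)
        else:
            log_geo_means.append(None)
    factors = []
    for sample in samples:
        log_ratios = [
            math.log(c) - g
            for c, g in zip(sample, log_geo_means)
            if g is not None
        ]
        # Raises statistics.StatisticsError if every gene has a zero somewhere.
        factors.append(math.exp(median(log_ratios)))
    return factors

# Made-up counts for 3 samples x 5 genes.
counts = [
    [100, 200, 300, 400, 1000],
    [200, 400, 600, 800, 2000],  # same profile as sample 0, twice the depth
    [100, 200, 300, 400, 5000],  # same depth as sample 0, last gene induced
]

factors = size_factors(counts)
normalized = [[c / f for c in sample] for sample, f in zip(counts, factors)]
print([round(f, 3) for f in factors])
```

In this toy case the deeper sample gets a factor twice that of the first, while the sample with a single induced gene gets the same factor as the first, so the unchanged genes line up after dividing by the factors.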
