Can we do between-sample normalization like deseq/TMM on RPKM?
7.2 years ago

Hi fellow biostars,

I have RPKM values for 20 samples (10 disease and 10 control). I'm trying to build a classification model, so I need to do between-sample normalization. I know that I could run DESeq/TMM on raw read counts, but is it okay to run DESeq/TMM on RPKM? Thanks a lot!

Update: I found that DESeq doesn't support continuous data like RPKM. So I guess my question is: are there any between-sample normalization methods that I could use on RPKM data? Thanks!

RNA-Seq genome R

The problem is not the normalization itself: you can normalize RPKM with the same method DESeq uses. The DESeq model, however, assumes the input is read counts, and it's pretty useless without them. Can't you get the raw read counts?
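To make the point about reusing DESeq's normalization concrete, here is a minimal sketch of median-of-ratios size factors applied to a continuous matrix such as RPKM. It is plain Python, not DESeq's own code; the function names `size_factors` and `normalize` and the toy matrix are made up for illustration, and skipping genes with a zero in any sample is an assumption of this sketch.

```python
import math
from statistics import median


def size_factors(matrix):
    """Median-of-ratios size factors.

    matrix: list of genes, each a list of per-sample values (e.g. RPKM).
    Returns one scaling factor per sample.
    """
    n_samples = len(matrix[0])
    # Assumption: only genes that are > 0 in every sample are used,
    # so that the log geometric mean is defined.
    usable = [row for row in matrix if all(v > 0 for v in row)]
    if not usable:
        raise ValueError("no gene is positive in every sample")

    log_ratios = [[] for _ in range(n_samples)]
    for row in usable:
        # Log of the gene's geometric mean across samples.
        log_geo_mean = sum(math.log(v) for v in row) / n_samples
        for j, v in enumerate(row):
            log_ratios[j].append(math.log(v) - log_geo_mean)

    # Each sample's factor is the median ratio to the per-gene reference.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(matrix):
    """Divide every sample (column) by its size factor."""
    factors = size_factors(matrix)
    return [[v / f for v, f in zip(row, factors)] for row in matrix]


# Toy RPKM matrix: 4 genes x 3 samples (hypothetical values).
rpkm = [
    [10.0, 20.0, 5.0],
    [4.0, 8.0, 2.0],
    [100.0, 200.0, 50.0],
    [0.0, 3.0, 1.0],  # has a zero, so it is skipped when estimating factors
]
factors = size_factors(rpkm)
normalized = normalize(rpkm)
```

This only rescales the samples so they are comparable; it does not turn RPKM into something the DESeq count model can test on.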

7.2 years ago
agoel ▴ 30

RNA-seq expression values can be reported in two ways: as counts, or as RPKM/FPKM.

In the latter case, the values are estimates (counts scaled by gene length and library size) and are therefore decimals rather than integers. Count-based data, by contrast, are whole numbers (the number of reads assigned to each gene), and the rigorous statistical procedures (DESeq, edgeR, limma/voom, etc.) are built for them; those count models cannot be applied to RPKM/FPKM directly.
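As a small illustration of why RPKM values are decimals and not counts, the sketch below applies the standard RPKM definition (reads per kilobase of gene per million mapped reads) to toy numbers. The function name `rpkm_from_counts` and all values are hypothetical, and the library size is taken as the sum of the toy counts, which is an assumption of this example.

```python
def rpkm_from_counts(counts, gene_lengths_bp):
    """RPKM = count * 1e9 / (total mapped reads * gene length in bp)."""
    # Assumption: library size is the sum of the counts given here.
    total_reads = sum(counts)
    return [
        c * 1e9 / (total_reads * length)
        for c, length in zip(counts, gene_lengths_bp)
    ]


# Toy sample: integer read counts and gene lengths for 3 genes.
counts = [500, 1500, 3000]
lengths = [1200, 3000, 1500]
values = rpkm_from_counts(counts, lengths)
```

Going the other way, from RPKM back to integer counts, needs both the gene lengths and the library size, which is why the raw counts are worth tracking down if they exist.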

In summary, a proper cross-sample normalisation, one that will also remove batch effects if present, needs count-based data.
