Question: Regarding Normalization Of RNA-Seq Data: Do People Use Total Number Of Reads Per Lane Or Total Number Of Mapped Reads Per Lane?
7.5 years ago by
Steffi560
Germany
Steffi560 wrote:

As I am not sure that all people are aware of the difference...

normalization rna • 2.9k views
7.5 years ago by
Ido Tamir4.9k
Austria
Ido Tamir4.9k wrote:

I think every sane person is aware of the difference. RPKM was defined in Mortazavi as: "transcript levels in reads per kilobase of exon model per million mapped reads (RPKM)". It does not make sense to use the total number of reads. RPKM is the simplest way to normalize RNA-Seq data. More advanced methods are being developed (e.g. cqn and references therein; I have not used it myself yet).

It does not make any sense to use the number of reads that come out of the machine, because this includes artifacts (adapter dimers), low-quality reads with errors that do not map to the genome, etc.

I would also watch out for reads (e.g. rRNA) whose abundance varies between preparations and that map uniquely to the genome, and for the effect they have on RPKM values.
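To make the RPKM definition and the rRNA caveat concrete, here is a minimal sketch in Python. All numbers are made up for illustration, and dropping rRNA reads from the denominator is just one possible way to handle the problem, not something prescribed above.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and mapped-read total must be positive")
    # count / (length / 1e3) / (mapped / 1e6) == count * 1e9 / (length * mapped)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000 bp exon model in both preparations.
gene_reads, gene_length = 500, 2000
non_rrna_mapped = 18_000_000  # assumed identical in both preparations

# Only the number of mapped rRNA reads differs between the two preparations.
rrna_low, rrna_high = 2_000_000, 12_000_000

rpkm_low_rrna = rpkm(gene_reads, gene_length, non_rrna_mapped + rrna_low)
rpkm_high_rrna = rpkm(gene_reads, gene_length, non_rrna_mapped + rrna_high)

# One option (an assumption here): leave rRNA reads out of the denominator.
rpkm_without_rrna = rpkm(gene_reads, gene_length, non_rrna_mapped)

print(rpkm_low_rrna, rpkm_high_rrna, rpkm_without_rrna)
```

The gene has the same count in both preparations, yet its RPKM drops when more rRNA reads are mapped, simply because the denominator grows.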


Agree. It does not make any sense to normalize by the number of reads coming from a lane (the same goes for fractions of a lane if you multiplex). Normalize by the number of mapped reads or use more sophisticated methods like RPKM.

written 7.5 years ago by Konrad690

I am not so sure that normalizing by total number of mapped reads is that sensible. What you really want to do is to account for different sequencing depths in multiple samples. Normalizing by total number of mapped reads implies that one biases the sequencing depth estimate by the mapping choice.

I am not happy with normalizing by total number of reads either. I quite like the approach of DESeq, where they use the idea of a virtual reference sample.
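For readers who have not seen the virtual reference idea, here is a rough sketch of a median-of-ratios size factor calculation in plain Python. It is my own illustration with made-up counts, not code from the DESeq package: the reference is taken as the per-gene geometric mean across samples, and genes with a zero count in any sample are skipped (an assumption of this sketch).

```python
import math
from statistics import median


def size_factors(counts):
    """counts maps sample name -> list of raw gene counts (same gene order)."""
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Virtual reference sample: log geometric mean of each gene across samples.
    log_reference = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):  # skip genes with any zero count
            mean_log = sum(math.log(v) for v in values) / len(values)
            log_reference.append((g, mean_log))
    if not log_reference:
        raise ValueError("no gene has non-zero counts in every sample")

    # Size factor: median ratio of a sample's counts to the reference.
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][g]) - ref for g, ref in log_reference]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical counts: sample_b was sequenced about twice as deeply as sample_a.
counts = {
    "sample_a": [100, 200, 300, 0],
    "sample_b": [200, 400, 600, 50],
}
factors = size_factors(counts)
print(factors)
```

Dividing each sample's counts by its size factor puts the samples on a common scale without relying on either the total read count or the mapped read count.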

written 7.5 years ago by Steffi560

I did not say it's sensible, only that it is the simplest one. The linked paper gives references to other methods.

written 7.5 years ago by Ido Tamir4.9k