Why is the scaling factor always a million when dealing with RNA-Seq expression?
7.7 years ago
Sandeep ▴ 260

I have a basic conceptual question about RNA-Seq expression units. When quantifying expression with different methods, why is it that we always scale by a million and not some other number? I have seen this in RPKM, FPKM, TPM and CPM.

Can anyone explain the rationale behind it? Why not a hundred thousand?

Thanks

RPKM TPM FPKM RNA-Seq CPM • 2.1k views
It might be because NGS data is typically on the scale of millions ('n' million reads), so the scaling factor used is also a million (per million reads).

7.7 years ago

1 million is a nice round number and the range of resulting values is more or less convenient.

Is that it? Isn't there any logical explanation as to why 1 million was chosen as the most convenient, nice round number? We could probably use 10^5.

I guess there is something more to it. We probably need a statistician to answer this.

You can ask a statistician and you'll get the same answer. There are many arbitrary values that we use every day.

Yes, you could, but what would you call it then? "Reads per one hundred thousand"? RPM was used because it makes the numbers look nicer and is easy to define and pronounce.

There's nothing more to it. Why do we give distances in kilometers and not megameters? A mixture of convenience for the quantities we typically report and historical inertia.
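To make the point concrete, here is a minimal sketch in Python of a counts-per-N normalization. The function name `counts_per`, the gene names and the read counts are all made up for illustration; this is not taken from any particular tool. It shows that swapping 10^6 for 10^5 only moves the decimal point, while the ratios between genes stay the same.

```python
def counts_per(counts, scale=1_000_000):
    """Scale raw read counts to 'counts per `scale` reads' (CPM when scale is 10^6)."""
    total = sum(counts.values())  # library size: total reads in the sample
    return {gene: c * scale / total for gene, c in counts.items()}

# Hypothetical sample with 20 million reads in total
counts = {"geneA": 50, "geneB": 1_200, "geneC": 30_000, "geneD": 19_968_750}

cpm = counts_per(counts)                    # per million
cp100k = counts_per(counts, scale=100_000)  # per hundred thousand

for gene in counts:
    print(gene, cpm[gene], cp100k[gene])
```

With a library of a few tens of millions of reads, the per-million values for lowly expressed genes land around 1 to 100, which is easy to read, whereas per-hundred-thousand pushes the same genes below 1. Nothing downstream changes, since every value is multiplied by the same constant.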