Question: How to normalize, median-center, and log-transform an expression matrix
5 months ago, user31888 (United States) wrote:

Hi,

I often read in articles that RNA-seq read-count expression matrices (rows = genes, columns = samples) have been FPKM-normalized, median-centered, and log2-transformed.

In which order are these steps performed?

When you do median-centering, do you subtract from each expression value the median calculated per sample (column) or per gene (row)?

In order to avoid producing NaNs during the log2 transformation, at which step do we add +1 to the expression values (to get > 0 values)?

I see things this way:

  1. Add +1 to all read counts

  2. Normalise read counts (into FPKM, TPM, whatever...)

  3. Subtract the median of the sample (column) from each expression value

  4. Log-transform.

Is this correct?

Thanks!

Tags: rna-seq, R

Usually the log2 transformation and the RPKM calculation are done in one step; see, for example, the rpkm() function in edgeR, which has a log argument for this.
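To make the "one step" idea concrete, here is a minimal base-R sketch. The matrix `counts`, the vector `gene_length` (in bp), and the pseudocount of 1 are all made-up assumptions for illustration. It is a simplified stand-in, not edgeR's own computation; edgeR's rpkm() treats its prior count differently.

```r
# Toy count matrix: rows = genes, columns = samples (made-up numbers)
counts <- matrix(c(10, 0, 200, 35,
                   20, 5, 150, 40,
                   15, 0, 300, 10),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Hypothetical gene lengths in base pairs
gene_length <- c(gene1 = 1000, gene2 = 2000, gene3 = 500, gene4 = 1500)

# FPKM = count / (library size in millions) / (gene length in kb)
lib_size <- colSums(counts)
fpkm <- t(t(counts) / (lib_size / 1e6)) / (gene_length / 1e3)

# Add the pseudocount right before the log, so zero counts map to log2(1) = 0
log_fpkm <- log2(fpkm + 1)
```

In this sketch the +1 is added after normalization and immediately before the log, so a gene with zero reads ends up at exactly 0 on the log scale instead of -Inf.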

The next step is the median centering, but in my opinion it is more common to use z-scores for heatmaps.
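For comparison, per-gene z-scores could be computed roughly as follows. This is only a sketch; the matrix `log_expr` is a made-up, already log-transformed example, not data from the question.

```r
# Toy log2 expression matrix: rows = genes, columns = samples (made-up values)
log_expr <- matrix(c(1, 5, 2,
                     2, 7, 2,
                     3, 9, 5),
                   nrow = 3,
                   dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3)))

# scale() centers and scales columns, so transpose to work per gene (row),
# then transpose back to the genes-by-samples layout
z <- t(scale(t(log_expr)))

# Note: a gene with identical values in all samples has sd = 0 and yields NaN
```

Each row of `z` then has mean 0 and standard deviation 1, which puts genes with very different expression levels on a comparable color scale in a heatmap.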

Written 5 months ago by Benn

Ok, I see. The median centering is done by columns (samples), right?

Written 5 months ago by user31888

Well, it depends on what exactly you mean by that. For each row (gene), you should take the median over all samples and subtract it from that row. Is that what you meant?
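A small base-R sketch of the two readings, assuming a made-up log2 matrix `log_expr` with genes in rows and samples in columns (the variable names are illustrative only):

```r
# Toy log2 expression matrix: rows = genes, columns = samples (made-up values)
log_expr <- matrix(c(1, 5, 2,
                     2, 7, 2,
                     4, 9, 5),
                   nrow = 3,
                   dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3)))

# Per-gene centering: subtract each row's median (taken over all samples)
gene_medians <- apply(log_expr, 1, median)
centered_by_gene <- sweep(log_expr, 1, gene_medians, FUN = "-")

# Per-sample centering, shown only for contrast: subtract each column's median
sample_medians <- apply(log_expr, 2, median)
centered_by_sample <- sweep(log_expr, 2, sample_medians, FUN = "-")
```

Either way, the centering is applied to values that are already on the log scale. Subtracting a median before taking log2 would leave negative values, for which log2 returns NaN.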

Written 5 months ago by Benn