I often read in articles that RNA-Seq read-count expression matrices (rows = genes, columns = samples) have been FPKM-normalized, median-centered, and log2-transformed.
In which order are these steps performed?
When you do median-centering, do you subtract from each expression value the median calculated per sample (column) or per gene (row)?
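To make the two options concrete, here is a minimal NumPy sketch. The matrix holds made-up toy values (not real data), and the variable names are only illustrative.

```python
import numpy as np

# Toy matrix: rows = genes, columns = samples (made-up values).
expr = np.array([[1.0, 4.0, 7.0],
                 [2.0, 5.0, 8.0],
                 [3.0, 6.0, 100.0]])

# Option A: per-sample centering, i.e. subtract each column's median.
per_sample = expr - np.median(expr, axis=0, keepdims=True)

# Option B: per-gene centering, i.e. subtract each row's median.
per_gene = expr - np.median(expr, axis=1, keepdims=True)
```

After option A every column has median 0; after option B every row has median 0. The two results generally differ.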
To avoid producing NaN or -Inf values during the log2 transformation, at which step do we add 1 to the expression values (so that they are all > 0)?
I see things this way:
1. Add 1 to all read counts
2. Normalise the read counts (into FPKM, TPM, whatever...)
3. Subtract the median of the sample (column) from each expression value
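Written out as code, the order I have in mind would look roughly like the sketch below. It assumes a NumPy array of raw counts and a vector of gene lengths in base pairs. The function name `proposed_pipeline` and the toy inputs are hypothetical, and the library size is computed from the shifted counts only because the pseudocount is added first. The log2 step is left out because I am not sure where it belongs.

```python
import numpy as np

def proposed_pipeline(counts, gene_lengths_bp):
    """Apply the three steps in the order listed above (hypothetical helper)."""
    counts = np.asarray(counts, dtype=float)
    # Gene lengths in kilobases, shaped as a column to broadcast over samples.
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float)[:, None] / 1e3

    # Step 1: add a pseudocount of 1 to every read count.
    shifted = counts + 1.0

    # Step 2: FPKM-style scaling, per million reads in the sample and per kb.
    per_million = shifted.sum(axis=0, keepdims=True) / 1e6
    fpkm = shifted / per_million / lengths_kb

    # Step 3: subtract each sample's (column's) median.
    return fpkm - np.median(fpkm, axis=0, keepdims=True)

# Toy input: 3 genes x 2 samples, made-up values.
counts = [[10, 20], [0, 5], [100, 50]]
lengths = [1000, 2000, 500]
centered = proposed_pipeline(counts, lengths)
```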
Is it correct?