Suppose I have a matrix of gene counts whose rows represent genes and whose columns represent samples. For RNA-seq packages such as edgeR and DESeq2, the counts are normalized and standardized before DE analysis.
My question is about the standardization itself. After normalizing each column to adjust for sequencing depth and library size, I calculate log2(normalized counts + 1). Is each column then further standardized (to mean = 0 and standard deviation = 1), or is the standardization done on each row only?
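For concreteness, here is a toy sketch of the transformation I mean, with made-up counts (in R, `scale()` works column-wise, so row-wise standardization needs a transpose):

```r
set.seed(1)
# Toy count matrix: 10 genes (rows) x 4 samples (columns)
counts <- matrix(rpois(40, lambda = 50), nrow = 10,
                 dimnames = list(paste0("gene", 1:10), paste0("sample", 1:4)))

# 1. Adjust each column for sequencing depth (here a simple counts-per-million)
cpm <- t(t(counts) / colSums(counts)) * 1e6

# 2. Log-transform
logcpm <- log2(cpm + 1)

# 3a. Standardize each column to mean 0, sd 1 ...
col_std <- scale(logcpm)
# 3b. ... or standardize each row instead?
row_std <- t(scale(t(logcpm)))
```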
Thank you for your help.
You don't have to do anything to the counts; just give the raw counts to DESeq2/edgeR.
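For example, a minimal sketch of both workflows, assuming a raw integer count matrix `counts`, a sample table `coldata` with a `condition` column, and a matching `group` factor (all hypothetical names):

```r
## DESeq2: start from raw counts; normalization happens inside DESeq()
library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData   = coldata,
                              design    = ~ condition)
dds <- DESeq(dds)          # estimates size factors and dispersions, fits the model
res <- results(dds)

## edgeR: likewise start from raw counts
library(edgeR)
y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)    # TMM normalization factors
design <- model.matrix(~ group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
qlf <- glmQLFTest(fit)
topTags(qlf)
```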
There is no need for standardization; just give the raw data. DESeq2 won't even accept your normalized data.
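To illustrate (a small sketch with simulated counts; the exact error message may differ by version), the DESeq2 constructor checks for integer counts and will refuse a matrix of log-normalized values:

```r
library(DESeq2)
raw <- matrix(rpois(40, lambda = 20), nrow = 10,
              dimnames = list(paste0("gene", 1:10), paste0("s", 1:4)))
coldata <- data.frame(condition = factor(c("A", "A", "B", "B")),
                      row.names = colnames(raw))

# Raw integer counts: accepted
dds <- DESeqDataSetFromMatrix(raw, coldata, design = ~ condition)

# Pre-normalized, non-integer values: rejected with an error about non-integer counts
norm <- log2(t(t(raw) / colSums(raw)) * 1e6 + 1)
try(DESeqDataSetFromMatrix(norm, coldata, design = ~ condition))
```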
I just want to understand what the packages do to the raw counts in the background.
Do these packages calculate a normalization factor (like what Damian Kao described), scale the counts by that factor, and then run the DE test? Do they standardize the normalized counts before the DE test?
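To make my question concrete, this is roughly the workflow I am imagining, reusing the toy `counts` matrix from my example above (`nf` and `scaled` are made-up names, not anything the packages actually expose):

```r
# My mental model of what might happen internally (pure speculation):
nf <- colSums(counts) / mean(colSums(counts))   # some per-sample normalization factor
scaled <- t(t(counts) / nf)                     # counts scaled by that factor

# Is the DE test then run on 'scaled' directly, or on something like
# scale(log2(scaled + 1))   # i.e. column-standardized log counts?
```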
Then you should probably read the paper and the vignette.