When performing between-sample normalization with the TMM approach in edgeR, how should RNA-Seq data with multiple groups be defined? My understanding is that TMM relies on the log2 fold changes between the groups defined in the experimental design matrix. For instance, if my data contains the groups control, control, treatment1, treatment1, treatment1, treatment2, treatment2, should I define my experimental design to contain all of these groups, or should I normalize each comparison separately?
A quick check of the calcNormFactors function in R (https://github.com/cran/TCC/blob/master/R/calcNormFactors.R) showed that if the provided experimental design matrix contains more than one group (columns > 1), the edgeR approach is applied, which is TMM if selected. However, it is not clear whether the function can detect multiple groups in the design matrix.
This normalization method is group-independent. TMM computes a trimmed mean of log fold changes between each sample and a single reference sample, not between groups, so it is applied to all samples present in the DGEList regardless of the design. By the way, the link you added points to a different package called TCC, not to edgeR's own code. Be careful with this kind of thing: function names can be shared by many packages. The edgeR code can be reviewed by downloading the source package from https://bioconductor.org/packages/release/bioc/html/edgeR.html or by consulting the help page:
?edgeR::calcNormFactors
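If you want to convince yourself of the group independence, a minimal sketch along these lines should do it. It assumes edgeR is installed; the count matrix is simulated and the object names (`counts`, `y_grouped`, `y_nogroup`) are just illustrative, not anything from your data. It normalizes the same counts once with your seven group labels and once with no grouping at all, then compares the resulting normalization factors.

```r
library(edgeR)

set.seed(1)

# Simulated counts for illustration only: 1000 genes x 7 samples
counts <- matrix(rnbinom(7000, mu = 100, size = 10), ncol = 7)
colnames(counts) <- paste0("s", 1:7)

# Group labels as described in the question
group <- factor(c("control", "control",
                  "treatment1", "treatment1", "treatment1",
                  "treatment2", "treatment2"))

# Same counts, once with group labels and once without
y_grouped <- calcNormFactors(DGEList(counts = counts, group = group),
                             method = "TMM")
y_nogroup <- calcNormFactors(DGEList(counts = counts),
                             method = "TMM")

# Compare the TMM factors from the two runs
same_factors <- isTRUE(all.equal(y_grouped$samples$norm.factors,
                                 y_nogroup$samples$norm.factors))
print(same_factors)
```

In practice this means you can put all samples in one DGEList, normalize once, and then specify the groups only when you build the design matrix for the downstream tests.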
Thanks a lot, this was very helpful!