Until now I have mainly analyzed RNA-seq data from our own samples, which are mostly sequenced in the same batch under the same conditions. In that setting, both DESeq2 and TMM normalization work very well.
However, when I started analyzing human data from a large cohort, I saw someone using cqn (Conditional Quantile Normalization), which corrects for GC-content bias between lanes (or batches, or samples?). Judging from the original CQN article (DOI: 10.1093/biostatistics/kxr054) and a paper from Jonathan K. Pritchard's group (DOI: 10.1093/biostatistics/kxr054, supplementary figure 12), GC content can affect the results substantially, and the GC bias is sample-specific.
I compared my results based on logCPM values from TMM normalization (logCPM computed with limma) against those based on CQN-normalized values (cqn_result$y + cqn_result$offset). The overall patterns are similar, but the differences are much more pronounced and more significant with TMM normalization than with CQN normalization.
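To make the comparison more concrete, here is a minimal sketch of one diagnostic that could be run on both normalized matrices: for each sample, the slope of a gene's deviation from its across-sample median against that gene's GC content. It is not part of cqn, edgeR, or limma. The function name gc_slopes and the toy numbers are made up for illustration, and it is written in plain Python only to keep it self-contained. The assumption is that a sample with little residual GC bias should give a slope close to zero.

```python
from statistics import mean, median


def gc_slopes(log_expr, gc):
    """Per-sample least-squares slope of gene-wise deviations on GC content.

    log_expr: one row per gene, each row a list of log2 values (one per sample).
    gc:       GC fraction of each gene, in the same gene order as log_expr.
    (Hypothetical helper, for illustration only.)
    """
    n_samples = len(log_expr[0])
    # Centre each gene on its across-sample median so that the gene's overall
    # expression level cancels and only sample-specific deviations remain.
    centred = []
    for row in log_expr:
        m = median(row)
        centred.append([v - m for v in row])

    gc_mean = mean(gc)
    gc_ss = sum((g - gc_mean) ** 2 for g in gc)

    slopes = []
    for j in range(n_samples):
        dev = [row[j] for row in centred]
        dev_mean = mean(dev)
        cov = sum((g - gc_mean) * (d - dev_mean) for g, d in zip(gc, dev))
        slopes.append(cov / gc_ss)
    return slopes


# Toy data (invented): 5 genes, 3 samples. Samples 0 and 1 carry no GC effect;
# sample 2 has an artificial GC-dependent shift of 4 * (gc - 0.5) log2 units.
gc = [0.3, 0.4, 0.5, 0.6, 0.7]
base = [5.0, 6.0, 7.0, 8.0, 9.0]
toy = [[b, b, b + 4.0 * (g - 0.5)] for b, g in zip(base, gc)]

slopes = gc_slopes(toy, gc)
for j, s in enumerate(slopes):
    print(f"sample {j}: GC slope = {s:.3f}")
```

Running the same function on the TMM logCPM matrix and on the CQN matrix (exported from R, with matching gene order) would show whether the samples differ in GC slope before CQN and whether CQN flattens those slopes. A straight-line fit is a simplification, since GC effects can be non-linear.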
From a search of the available literature, CQN seems to be less widely used than TMM or DESeq2, and few studies have directly compared the three methods.
I'm now quite confused about my results, especially about how reliable they are.
Any suggestion is welcome and appreciated.