I have two batches of RNA-seq data, one containing all of my cases and the other containing all of my controls. Assume that I cannot redesign the experiment to distribute cases and controls across both batches, which would obviously be better.
What if I took RNA from a cell line (neither case nor control), split it into two aliquots, and included one aliquot in each batch? Are there methods that would let me model the batch effect from this technical replicate and then apply the correction to the rest of the samples?
It seems that RUVSeq tried to do something similar using ERCC spike-ins (with only moderate success), but that's slightly different, since it explicitly defines a set of ERCC "genes" to use for modeling the batch effects.
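
To make the idea concrete, here is a minimal sketch of the naive version I have in mind, in Python with NumPy. It is only an illustration, not an established method: the function names (`log_cpm`, `correct_with_reference`) are hypothetical, and it assumes the batch effect is a per-gene additive shift on the log scale that affects the cell line the same way it affects the real samples. With a single replicate pair, the estimated offset also absorbs ordinary technical noise, which is part of what I'm worried about.

```python
import numpy as np

def log_cpm(counts, prior=1.0):
    """Log2 counts-per-million for a genes x samples matrix of raw counts."""
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0, keepdims=True)
    # A small prior count avoids taking log2(0) for unexpressed genes.
    return np.log2((counts + prior) / (lib_size + prior) * 1e6)

def correct_with_reference(batch_a, batch_b, ref_a, ref_b):
    """Shift batch B onto batch A's scale using one shared reference sample.

    batch_a, batch_b: genes x samples raw count matrices.
    ref_a, ref_b: raw count vectors for the cell-line aliquot in each batch.
    Assumption: the batch effect is a per-gene additive offset in log space.
    """
    ref_a = np.asarray(ref_a).reshape(-1, 1)
    ref_b = np.asarray(ref_b).reshape(-1, 1)
    # Per-gene offset estimated from the two aliquots (one column, all genes).
    offset = log_cpm(ref_b) - log_cpm(ref_a)
    # Batch A is left as is; the offset is subtracted from every batch B sample.
    return log_cpm(batch_a), log_cpm(batch_b) - offset

# Toy data: 4 genes, 2 samples per batch (made-up numbers).
ref_a = np.array([100, 200, 50, 400])
ref_b = np.array([150, 180, 90, 380])
batch_a = np.array([[110, 95], [210, 190], [40, 60], [390, 420]])
batch_b = np.array([[160, 140], [170, 200], [100, 85], [400, 360]])

log_a, log_b_corrected = correct_with_reference(batch_a, batch_b, ref_a, ref_b)
print(log_b_corrected.shape)
```

By construction, the reference aliquot from the second batch maps exactly onto the one from the first, so it cannot be used to check how well the correction worked.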