Probably a newbie question, but I want to know whether it makes sense to run batch effect correction (with ComBat from the sva package) on only part of an RNA-seq expression matrix at a time. The problem is that the matrix I'm working with is too big to load into my R workspace as a whole, so I was thinking of partitioning it into gene-based subsets. I would simply split it into files with a manageable number of rows and process them one by one.
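To make the partitioning idea concrete, here is a minimal sketch in Python (just for illustration, the real work would happen in R). The function name `iter_gene_chunks`, the tab-separated layout and the toy values are all my own assumptions, not anything from ComBat or sva. The only point is that every chunk keeps the same header, so the sample order is identical across partitions.

```python
from itertools import islice


def iter_gene_chunks(lines, rows_per_chunk):
    """Yield (header, rows) pairs with at most rows_per_chunk gene rows each.

    Hypothetical helper: `lines` is any iterable of text lines whose first
    line is the header with the sample names.
    """
    it = iter(lines)
    header = next(it)  # sample names, repeated for every chunk
    while True:
        rows = list(islice(it, rows_per_chunk))
        if not rows:
            break
        yield header, rows


# Toy matrix (made-up values): one header line, then one line per gene.
toy = [
    "gene\ts1\ts2\ts3\ts4",
    "g1\t5\t6\t9\t10",
    "g2\t1\t2\t4\t5",
    "g3\t7\t7\t8\t8",
]

chunks = list(iter_gene_chunks(toy, rows_per_chunk=2))
for header, rows in chunks:
    print(header, len(rows))
```

With a real file, `lines` would be the open file handle, so only one chunk sits in memory at a time.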
Since every sample appears in every partition, I would be able to use the same adjustment-variable matrix and batch ID vector for all partitions. I have never done batch effect correction before, so I don't know whether not using the whole matrix renders the method pointless. My guess is that it doesn't: from what I understood of the method's paper (to be honest, I didn't get the fine details), the expression of one gene doesn't interact with the expression of another during the calculations, so it shouldn't be essential to process all genes in one go. Still, I definitely need someone else's advice to make sure I'm not messing up here.
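Here is a toy version of my reasoning, again in Python and with made-up numbers. Note that `center_batches` is a hypothetical stand-in that only shifts each batch to the gene's overall mean; it is deliberately not ComBat (no covariates, no variance scaling, no empirical Bayes step). For an adjustment like this, which only ever looks at values within one row, running it on the whole matrix or on gene partitions gives the same result as long as the batch vector is the same.

```python
from statistics import mean


def center_batches(rows, batch):
    """Per gene, shift each batch so its mean equals that gene's overall mean.

    Simplified illustration only, NOT ComBat: each row is adjusted using
    nothing but the values in that same row.
    """
    adjusted = []
    for row in rows:
        grand = mean(row)
        batch_mean = {
            b: mean(v for v, bb in zip(row, batch) if bb == b)
            for b in set(batch)
        }
        adjusted.append([v - batch_mean[b] + grand for v, b in zip(row, batch)])
    return adjusted


# Same batch vector for every partition, because every sample is in every one.
batch = ["A", "A", "B", "B"]
matrix = [
    [5.0, 6.0, 9.0, 10.0],
    [1.0, 2.0, 4.0, 5.0],
    [7.0, 7.0, 8.0, 8.0],
]

whole = center_batches(matrix, batch)
in_parts = center_batches(matrix[:2], batch) + center_batches(matrix[2:], batch)
print(whole == in_parts)
```

What this sketch cannot tell me is whether the same holds for ComBat itself. As far as I can tell, its empirical Bayes step estimates its priors from all the genes it is given, so that part might depend on how the genes are partitioned, and that is exactly the bit I'm unsure about.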