I recently started comparing two RNA-seq datasets from different conditions (vehicle control vs. treatment). Unfortunately, they were prepared by different researchers and measured on different dates, so they show a huge batch effect, as seen in the PCA plot (gray: day1 control, black: day2 treatment, red: day1 positive control, blue: day2 positive control).
We tried to correct the batch effect with R packages such as sva, but found that they cannot separate the batch effect from the treatment effect, because batch and condition are completely confounded in our design (all controls were run on day1, all treated samples on day2). I guess that if we had included both control and treatment samples in each run, we would have been able to correct the batch effect and see any difference between the control and treatment conditions.
Fortunately, however, we measured the identical control samples in each run. Now I am wondering: is there any method to correct the batch effect by using the same control samples measured in different runs as a reference?
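To make the idea concrete, here is a minimal sketch of the kind of correction I have in mind. It is written in Python with NumPy purely for illustration and is not taken from sva or any other package. The function name `correct_with_reference` and its arguments are hypothetical. It assumes the expression values are on a log scale, that the batch effect is an additive per-gene shift, and that this shift is the same for the shared control samples and for the other samples in the same run. I am not sure these assumptions hold for our data.

```python
import numpy as np

def correct_with_reference(log_expr, batch, is_reference):
    """Shift each batch so its reference samples share a common per-gene mean.

    log_expr     : (genes, samples) array of log-scale expression values
    batch        : (samples,) array of batch labels (e.g. "day1", "day2")
    is_reference : (samples,) boolean array, True for the shared control samples
    """
    log_expr = np.asarray(log_expr, dtype=float)
    batch = np.asarray(batch)
    is_reference = np.asarray(is_reference, dtype=bool)

    # Common target: per-gene mean of the reference samples over all batches.
    target = log_expr[:, is_reference].mean(axis=1, keepdims=True)

    corrected = log_expr.copy()
    for b in np.unique(batch):
        in_batch = batch == b
        ref_in_batch = in_batch & is_reference
        if not ref_in_batch.any():
            raise ValueError(f"batch {b!r} has no reference samples")
        # Per-gene offset of this batch, estimated from its reference samples only.
        offset = log_expr[:, ref_in_batch].mean(axis=1, keepdims=True) - target
        # Apply the same offset to every sample in the batch (assumption: additive shift).
        corrected[:, in_batch] -= offset
    return corrected
```

After this shift the shared control samples from the two runs have the same per-gene mean by construction, so any remaining difference between the day1 control and the day2 treatment would be attributed to the treatment, but only if the assumptions above are reasonable.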
Thank you in advance,