Hi, I'm trying to analyze RNA-seq data from multiple ICGC projects. I want to remove batch effects between projects with the SVA package before using DESeq2 to find differentially expressed genes, followed by network analysis. I have these questions:
1) DESeq2 takes only raw read counts, which shouldn't be normalized in any way. How can I use SVA output as DESeq2 input? I know that DESeq2 can take a confounding variable (project code) in its design formula, but I don't want to use that.
2) Should I log transform my raw read counts before using SVA?
3) Can I use microarray and RNA-seq data in a single analysis after removing batch effects with SVA and normalizing? Is that a valid analysis? If not, what is the best way to analyze the two datasets as a single dataset? About half of my samples have only microarray data, and no sample appears in both datasets.
thanks
Thanks Kevin, that made it a bit clearer. My third question is a critical problem for me. Can you help me with that? Do you know of any method or pipeline like this one: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3617154/
Regarding the third question, they seem to have used SVA to integrate the microarray and RNA-seq data. I wish they had provided the exact code they used, though. The methods section 'Multivariate analysis on combined data' is not clear.
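Since the paper's code is not available, here is a rough sketch of one simple way to put the two platforms on a comparable scale before combining them. To be clear, this is not SVA and not what the paper did: it is plain per-gene standardization within each platform, shown only to illustrate the idea. The function names (`log_cpm`, `zscore_within_platform`) and the toy matrices are made up for the example, and it assumes both matrices have the same genes in the same row order.

```python
import numpy as np


def log_cpm(counts, prior=1.0):
    """Convert raw counts (genes x samples) to log2 counts-per-million."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0, keepdims=True)  # total reads per sample
    return np.log2(counts / lib_sizes * 1e6 + prior)


def zscore_within_platform(expr):
    """Standardize each gene (row) to mean 0 and sd 1 across one platform's samples."""
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, ddof=1, keepdims=True)
    sd[sd == 0] = 1.0  # leave constant genes at 0 instead of dividing by zero
    return (expr - mean) / sd


# Toy data (hypothetical): 5 genes, 4 RNA-seq samples, 3 microarray samples.
rng = np.random.default_rng(0)
rnaseq_counts = rng.poisson(lam=50, size=(5, 4))
microarray_log2 = rng.normal(loc=8.0, scale=1.0, size=(5, 3))

# Standardize each platform separately, then stack the samples side by side.
rnaseq_z = zscore_within_platform(log_cpm(rnaseq_counts))
array_z = zscore_within_platform(microarray_log2)
combined = np.hstack([rnaseq_z, array_z])  # genes x (4 + 3) samples
```

Keep in mind that standardizing within each platform also removes any real biological difference that happens to be confounded with platform, so this only makes sense if the groups you want to compare are represented on both platforms. The resulting values are also no longer counts, so they would suit network or clustering analysis but not DESeq2.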