Question: RNA-seq batch effect removal before DESeq2
18 months ago, amin.ghareyazi wrote:

Hi, I'm trying to analyze RNA-seq data from multiple ICGC projects. I want to remove batch effects between projects with the SVA package before using DESeq2 to find differentially expressed genes, followed by network analysis. I have these questions:

1) DESeq2 uses only raw read counts, which shouldn't be normalized in any way. How can I use SVA output as DESeq2 input? I know that DESeq2 can take a confounding variable (project codes) in its design formula, but I don't want to use that.

2) Should I log-transform my raw read counts before using SVA?

3) Can I use microarray and RNA-seq data in a single analysis after removing batch effects with SVA and normalization? Is that a valid analysis? If not, what is the best way to analyze the two datasets as a single dataset? About half of my samples are only in the microarray data, and no sample appears in both datasets.


rna-seq R • 1.1k views
18 months ago, Kevin Blighe (Republic of Ireland) wrote:

As you know, the recommended approach is to include batch as a variable in your design formula. If you have identified other surrogate variables via SVA, then include those in your design formula. These will then be accounted for when the statistical model is fit to the data.
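
A minimal sketch of the design-formula approach described above, assuming simulated counts from DESeq2's makeExampleDESeqDataSet() in place of the real ICGC data. The column name `project`, its labels P1 to P3, and the choice of two surrogate variables are illustrative assumptions, not anything taken from the poster's data. DESeq2 still receives the raw counts; batch and surrogate variables enter only through the design.

```r
library(DESeq2)
library(sva)

# Simulated stand-in for the real count matrix (not ICGC data)
set.seed(42)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)  # 'condition' has levels A and B
dds$project <- factor(rep(c("P1", "P2", "P3"), times = 4))  # hypothetical batch labels

# Known batch first, variable of interest last; the input stays as raw counts
design(dds) <- ~ project + condition
dds <- DESeq(dds)
res <- results(dds, contrast = c("condition", "B", "A"))

# Optional: estimate surrogate variables for hidden batch structure.
# svaseq() takes normalized counts and log-transforms them internally.
dat  <- counts(dds, normalized = TRUE)
dat  <- dat[rowMeans(dat) > 1, ]
cd   <- as.data.frame(colData(dds))
mod  <- model.matrix(~ project + condition, data = cd)  # full model
mod0 <- model.matrix(~ project, data = cd)              # null model
sv   <- svaseq(dat, mod, mod0, n.sv = 2)                # n.sv = 2 is an arbitrary choice

# Add the surrogate variables to the design and refit
dds_sva <- dds
dds_sva$SV1 <- sv$sv[, 1]
dds_sva$SV2 <- sv$sv[, 2]
design(dds_sva) <- ~ SV1 + SV2 + project + condition
dds_sva <- DESeq(dds_sva)
res_sva <- results(dds_sva, contrast = c("condition", "B", "A"))
```

On real data, whether to fix `n.sv` or let svaseq() estimate it is a judgment call that depends on the dataset.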

If you want to remove the batch effect from your normalised + transformed data for downstream analyses, then you can do this via, for example, removeBatchEffect() (from limma). It operates on the transformed data itself, not on the raw counts, and you can pass it the design for your conditions of interest so that variation due to those conditions is preserved.
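
As a rough sketch of that second route, again assuming simulated counts and a hypothetical `project` batch column: the counts are variance-stabilized with vst(), and removeBatchEffect() is then applied to the transformed matrix. The corrected matrix is meant for downstream uses such as clustering or network analysis, not as input to DESeq2.

```r
library(DESeq2)
library(limma)

# Simulated stand-in for the real count matrix (not ICGC data)
set.seed(42)
dds <- makeExampleDESeqDataSet(n = 5000, m = 12)  # 'condition' has levels A and B
dds$project <- factor(rep(c("P1", "P2", "P3"), times = 4))  # hypothetical batch labels
design(dds) <- ~ project + condition

# Variance-stabilizing transformation; blind = FALSE lets it use the design
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)

# Design matrix for the effect to preserve (condition), without the batch term
keep <- model.matrix(~ condition, data = as.data.frame(colData(vsd)))

# Remove the project effect from the transformed values only
mat_corrected <- limma::removeBatchEffect(mat, batch = vsd$project, design = keep)
```

A PCA of `mat` versus `mat_corrected` is a common way to check whether samples still group by project afterwards.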



Thanks Kevin, that makes it a bit clearer. My third question is a critical problem for me. Can you help me with that? Do you know any method or pipeline like:

Reply written 18 months ago by amin.ghareyazi

Regarding the third question, they seem to have used SVA to integrate the microarray and RNA-seq data. I wish that they would provide the exact code that they used, though. The methods section 'Multivariate analysis on combined data' is not clear.

Reply written 18 months ago by Kevin Blighe