Question: RNA-seq batch effect removal before DESeq2
8 months ago by
amin.ghareyazi0 wrote:

Hi, I'm trying to analyze RNA-seq data from multiple ICGC projects. I want to remove batch effects between projects with the SVA package before using DESeq2 to find differentially expressed genes, followed by network analysis. I have these questions:

1) DESeq2 takes only raw read counts, which should not be normalized in any way. How can I use SVA output as DESeq2 input? I know that DESeq2 can take a confounding variable (project codes) in its design formula, but I don't want to use it.

2) Should I log transform my raw read counts before using SVA?

3) Can I use microarray and RNA-seq data in a single analysis after removing batch effects with SVA and normalization? Is that a valid analysis? If not, what is the best way to analyze the two datasets as a single dataset? About half of my samples are only in the microarray data. No samples are repeated across datasets.

Thanks

rna-seq R • 437 views
8 months ago by
Kevin Blighe56k wrote:

As you know, the recommended approach is to include batch as a variable in your design formula. If you have identified other surrogate variables via SVA, then include those in your design formula. These will then be accounted for when the statistical model is fit to the data.
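As a rough illustration of the point above, here is a minimal sketch in R. The simulated counts from makeExampleDESeqDataSet() and the project labels projA, projB and projC are assumptions standing in for the real ICGC data, and n.sv = 2 is an arbitrary choice. Note that DESeq2 still receives raw counts; the batch and any surrogate variables enter only as covariates in the design. svaseq() is given normalized counts and applies its own log transform internally.

```r
library(DESeq2)
library(sva)

# Simulated raw counts stand in for the real count matrix (assumption)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)

# Hypothetical project labels, balanced across the two conditions
dds$batch <- factor(rep(c("projA", "projB", "projC"), times = 4))

# Option 1: known batch as a covariate; the variable of interest goes last
design(dds) <- ~ batch + condition
dds <- DESeq(dds)
res <- results(dds)  # condition effect, adjusted for batch

# Option 2: estimate surrogate variables and use them as covariates
dat  <- counts(dds, normalized = TRUE)
dat  <- dat[rowMeans(dat) > 1, ]                  # drop very low-count genes
mod  <- model.matrix(~ condition, colData(dds))  # full model
mod0 <- model.matrix(~ 1, colData(dds))          # null model
svseq <- svaseq(dat, mod, mod0, n.sv = 2)        # n.sv = 2 is arbitrary here

dds_sva <- dds
dds_sva$SV1 <- svseq$sv[, 1]
dds_sva$SV2 <- svseq$sv[, 2]
design(dds_sva) <- ~ SV1 + SV2 + condition
dds_sva <- DESeq(dds_sva)
res_sva <- results(dds_sva)
```

Either way, the count matrix itself is never modified, which is why the SVA output does not need to be turned into "corrected counts" for DESeq2.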

If you want to remove the batch effect from your normalised and transformed data for downstream analyses, you can do this with, for example, removeBatchEffect() from limma. The correction is applied to the transformed values themselves, and the design matrix you supply tells the function which variation (such as your condition of interest) to preserve.
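A minimal sketch of that second route, again on simulated data with hypothetical project labels (both are assumptions for illustration). It uses a variance stabilizing transformation as the "normalised + transformed" step; rlog() would be an alternative.

```r
library(DESeq2)
library(limma)

# Simulated raw counts and hypothetical project labels (assumptions)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)
dds$batch <- factor(rep(c("projA", "projB", "projC"), times = 4))
design(dds) <- ~ batch + condition

# Normalize and transform; blind = FALSE lets the design inform dispersions
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)

# Design matrix holding the variation to preserve (the condition)
mm <- model.matrix(~ condition, colData(vsd))

# Remove the batch component from the transformed values
corrected <- removeBatchEffect(assay(vsd), batch = vsd$batch, design = mm)

# Store the corrected values for PCA, clustering or network analysis
assay(vsd) <- corrected
```

The corrected matrix is meant for visualisation and downstream steps such as clustering or network analysis. It should not be passed back into DESeq2 for differential expression; for that, the batch term in the design formula does the job.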

Kevin


Thanks Kevin, that made things a bit clearer. My third question is a critical problem for me. Can you help me with that? Do you know of any method or pipeline like this one: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3617154/

Reply written 8 months ago by amin.ghareyazi

Regarding the third question, the authors of that paper seem to have used SVA to integrate the microarray and RNA-seq data. I wish that they had provided the exact code that they used, though. The methods section 'Multivariate analysis on combined data' is not clear.

Reply written 8 months ago by Kevin Blighe