Hi All,
I have recently been analyzing some RNA sequencing data and running a differential expression analysis. Since the sequencing data were generated at different times (in separate experiments), I am worried about potential batch effects and would like a way to detect and remove them (removing them is the more important part!).
I tried to process the data in two different ways:
- Obtain the RSEM values and perform the differential expression analysis on them. For this approach, can anyone recommend a good tool for removing the batch effect from RSEM values?
- Obtain the HTSeq raw counts and use DESeq2 to perform the differential expression analysis. I saw an earlier discussion about adjusting for the batch effect by adding 'batch' to the 'design' parameter in DESeq2 [design ~ batch+treatment]. Besides this, can anyone suggest another good way to remove the batch effect?
Thanks for the help in advance.
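On the second route: as far as I understand, keeping 'batch' in the design is the usual way to account for it during testing, while explicitly removing it is mostly done on log-scale or otherwise transformed values for plotting and clustering (limma's removeBatchEffect in R is one commonly mentioned option). Below is a minimal NumPy sketch of that general idea, not limma's or DESeq2's actual implementation: fit a per-gene linear model with treatment and batch terms, then subtract only the fitted batch part. The function name, argument names, and the assumption that the input is a genes x samples matrix of log-scale values are all hypothetical.

```python
import numpy as np

def remove_batch_effect(log_expr, batch, treatment):
    """Subtract the fitted batch component from log-scale expression.

    log_expr  : genes x samples array of log-scale values (assumed input)
    batch     : per-sample batch labels
    treatment : per-sample treatment labels (kept in the model so the
                treatment effect is not removed along with batch)
    """
    log_expr = np.asarray(log_expr, dtype=float)
    batch = np.asarray(batch)
    treatment = np.asarray(treatment)
    n_samples = log_expr.shape[1]

    b_levels = np.unique(batch)
    if len(b_levels) < 2:
        return log_expr.copy()  # a single batch: nothing to remove

    # Treatment design: intercept plus one indicator column per
    # non-reference treatment level.
    t_levels = np.unique(treatment)
    T = np.column_stack(
        [np.ones(n_samples)]
        + [(treatment == t).astype(float) for t in t_levels[1:]]
    )

    # Batch design: indicator columns for non-reference batches, centered
    # so that subtracting the batch part leaves the overall mean unchanged.
    B = np.column_stack([(batch == b).astype(float) for b in b_levels[1:]])
    B = B - B.mean(axis=0)

    # Least-squares fit for all genes at once; coef has shape (terms, genes).
    X = np.hstack([T, B])
    coef, *_ = np.linalg.lstsq(X, log_expr.T, rcond=None)
    batch_coef = coef[T.shape[1]:, :]

    # Remove only the batch component of the fit.
    return log_expr - (B @ batch_coef).T
```

This only works if batch and treatment are not completely confounded (each treatment should appear in more than one batch), and the corrected values are meant for visualization rather than as input to a count-based test.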
When you cluster the samples or run a PCA, do you actually see a batch effect confounding the results? I am just curious.
Yes, I ran a PCA, but it did not show a very obvious batch effect, maybe because I have very few samples in each batch (2 to 5).
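With so few samples per batch, a PCA plot can be hard to judge by eye, so it may help to put a rough number on it. The sketch below is one possible way to do that with NumPy, under the assumption that the input is a genes x samples matrix of log-scale values: it computes principal component scores via SVD and then, for each component, the fraction of its variance that lies between batches. The function names are hypothetical and this is not part of any particular package.

```python
import numpy as np

def pca_scores(log_expr, n_components=2):
    """Return sample scores and explained-variance ratios for the top PCs.

    log_expr : genes x samples array of log-scale values (assumed input)
    """
    X = np.asarray(log_expr, dtype=float).T  # samples x genes
    X = X - X.mean(axis=0)                   # center each gene
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / np.sum(S**2)
    return scores, explained[:n_components]

def variance_explained_by_group(values, groups):
    """Fraction of the variance in `values` that lies between groups
    (between-group sum of squares divided by total sum of squares)."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    grand_mean = values.mean()
    total = np.sum((values - grand_mean) ** 2)
    if total == 0:
        return 0.0
    between = sum(
        np.sum(groups == g) * (values[groups == g].mean() - grand_mean) ** 2
        for g in np.unique(groups)
    )
    return float(between / total)
```

Calling variance_explained_by_group(scores[:, 0], batch) gives a value between 0 and 1 for the first component; a value near 1 on a component that explains a lot of variance would suggest the samples separate by batch, while with 2 to 5 samples per batch even moderate values should be read cautiously.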