Batch effect problem DEG, DESseq2
1
2
Entering edit mode
19 months ago
maria2019 ▴ 140

I have 9 samples for bulk RNA-seq analysis with DESeq2. I have 2 conditions and 2 batches. Below is the sample table.

My design is based on both batch and condition and the PCA plot shows that C1,C2, and C3 samples are separated. below is the code and the PCA plot.

dds <- DESeqDataSetFromTximport(txi.rsem, sampleTable, ~Batch+condition)
dds <- DESeq(dds)
vsd <- vst(dds, blind=FALSE)
plotPCA(vsd, intgroup=c("condition", "Batch"))+ geom_text(aes(label=name),vjust=2)


I know that it is recommended not to remove the batch effect but I was trying to check and see what would be the results after removing the batch effect. I again see that the control samples C1,C2,C3 are not clustered with the other control C4 from another batch. Below is the code and PCA plot after removing the batch effect.

mat <- assay(vsd)
mat2 <- limma::removeBatchEffect(mat, vsd\$Batch)
assay(vsd) <- mat2
plotPCA(vsd, intgroup=c("condition", "Batch")) + geom_text(aes(label=name),vjust=2)


My question is does it make sense to go further and do the DEG analysis? or my samples are not clustered/separated rationally and the results would not make any sense? I think I cannot remove C4 because that is the only connection between the samples of different batches (if that makes sense!). I was hoping that C1,C2, and C3 would cluster together with C4 and then the comparison between the Control and Treatment of different batches would be accurate.

DESeq2 batch-effect RNA-seq DEG • 1.1k views
1
Entering edit mode

Please search the site for "batch effect" and read posts. Your batches are unevenly distributed, with all treatment samples found in the same batch. If you were to adjust for batch effect, DESeq2 will not be able to differentiate changes owing to treatment and changes owing to batch effect, even with the lone control in B2. You should definitely use batch as a covariate, but results probably will not have a lot of statistical power.

0
Entering edit mode

Thank you for your response. I am reading other posts. based on your comment, do you suggest removing the batch effect step or just adjusting results based on batch effect and including batch as a covariate? (I assume the second one?)

1
Entering edit mode

"Removing" batch effect would be a lot more damaging that accounting for it as a covariate, especially in this case where your dataset might end up losing most of the significant differences even before DE analysis begins. Covariate assumption is the lesser of two evils here, but the true problem is experiment design.

2
Entering edit mode

Will DESeq allow batch to be a covariate if it's totally confounded in one condition? The second example seems applicable

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#model-matrix-not-full-rank

0
Entering edit mode

Good catch - DESeq cannot account for the batch effect here. Looks like OP is going to have to operate without accounting for batch effects.

7
Entering edit mode
19 months ago
ATpoint 49k

Batch correction requires replicates of each group in both batches. Your experiment is fully confounded with all treatments in B2. Sorry to say but this is simply an unfortunate experimental design and I do not think any in silico methods can save you from that.

Traffic: 2412 users visited in the last hour
FAQ
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.