Two rounds of batch effect removal
2.0 years ago
Kazuna • 0

Hi, we are currently doing RNA-seq analysis and would like to remove batch effects. I would like to compare our data with previously published data, but first the batch effects within our own data need to be removed. The design (stored as "samples") is as follows.

Sample  RNA_extraction  Experiment  treatment
1       1               ours        A
2       1               ours        A
3       1               ours        A
4       1               ours        A
5       1               ours        A
6       1               ours        B
7       1               ours        B
8       1               ours        B
9       1               ours        B
10      1               ours        B
11      2               ours        C
12      1               ours        C
13      2               ours        C
14      2               ours        C
15      2               ours        C
16      1               ours        C
17      3               ref         A
18      3               ref         A
19      3               ref         A
20      3               ref         D
21      3               ref         D
22      3               ref         D

library(limma)     # provides removeBatchEffect()
library(magrittr)  # provides the %>% pipe

# two batches are included
group <- samples$treatment
design <- model.matrix(~ group) 
batch <- samples$Experimenter
batch2 <- samples$RNA_extraction

rldData <- dat %>%
  removeBatchEffect(batch = batch, batch2 = batch2,
                    design = design)

When I ran the code above, batch2 was ignored. I think this is because the previously published data differ from ours in both $RNA_extraction and $Experiment. So I set batch to samples$Experimenter only, but the PCA shows that the batch effect within our own data remains (plot attached below). Therefore, my question is: is it possible to first remove the batch effect within our data, and then remove the batch effect that appears when comparing with the previously published data?

# only one batch is included
group <- samples$treatment
design <- model.matrix(~ group) 
batch <- samples$Experimenter

rldData <- dat %>%
  removeBatchEffect(batch = batch, 
                    design = design)

PCA plot legend: Black: A, Blue: B, Pink: C, Green: D

Thank you in advance! (I'm not a native English speaker, so please forgive me if I'm not clear.)

DESeq2 removeBatchEffect • 780 views

Are these two completely independent datasets?


Yes, they are.

2.0 years ago
Gordon Smyth ★ 7.0k

You refer to both "Experiment" and "Experimenter". I am going to assume that is just a typo and that they are the same thing.

In your sample information, the variable called "Experiment" is completely confounded with the variable called "RNA_extraction": every RNA_extraction batch belongs to exactly one Experiment. Hence adjusting for "RNA_extraction" will already adjust for "Experiment" as well. You should be setting batch <- samples$RNA_extraction and not using batch2. You cannot do two rounds of adjustment; that is not meaningful.
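
The confounding can be checked directly from the sample table. The following is a minimal sketch in base R, assuming the annotation is retyped by hand from the table in the question; the names xtab, X_rna, X_both, rank_rna and rank_both are only illustrative. It shows that adding "Experiment" to a model that already contains "RNA_extraction" adds a column but no new information.

```r
# Sample annotation retyped from the table in the question.
samples <- data.frame(
  Sample         = 1:22,
  RNA_extraction = factor(c(rep(1, 10), 2, 1, 2, 2, 2, 1, rep(3, 6))),
  Experiment     = factor(c(rep("ours", 16), rep("ref", 6))),
  treatment      = factor(c(rep("A", 5), rep("B", 5), rep("C", 6),
                            rep("A", 3), rep("D", 3)))
)

# Cross-tabulate the two batch variables: each RNA_extraction level
# (row) has samples in only one Experiment (column).
xtab <- table(samples$RNA_extraction, samples$Experiment)
print(xtab)

# Batch model with RNA_extraction alone, and with Experiment added.
X_rna  <- model.matrix(~ RNA_extraction, data = samples)
X_both <- model.matrix(~ RNA_extraction + Experiment, data = samples)

# Compare the number of columns with the rank of each matrix.
rank_rna  <- qr(X_rna)$rank
rank_both <- qr(X_both)$rank
cat("columns (RNA only):", ncol(X_rna),  " rank:", rank_rna,  "\n")
cat("columns (both):    ", ncol(X_both), " rank:", rank_both, "\n")
```

If the rank does not increase when "Experiment" is added, the extra column is a linear combination of the "RNA_extraction" columns, which is why a single call with batch = samples$RNA_extraction covers both sources of batch variation.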


I understand. Thank you so much for your kind reply!
