Question

Defining batch-effects in a DESeq2 design

7.9 years ago

Tobias.Wohland ▴ 70

Hi,

I have a couple of questions regarding my RNA-Seq experiment but I will start with a hopefully easy one.

I have three batch effects in my design. Unfortunately, one of these effects contains two batches with only one sample each. I found a post here that recommended not using such batch effects (see: here in the next-to-last comment). Is this right, and can someone explain why? It makes sense intuitively, but I cannot explain it myself.

My second question also concerns batches. Assuming I can use two of my batch variables (factor variables with at least 3 samples per batch), how do I define the design for DESeq2?

In the same post referenced above, I read that the variable of interest should go at the end of the design formula, like:

~batch1 + batch2 + condition

Again, is this correct?

Many thanks for your help. If this is clarified I will continue with my other questions.

Thanks!

RNA-Seq • 3.0k views

Updated 7.9 years ago by Devon Ryan 104k • written 7.9 years ago by Tobias.Wohland ▴ 70

score 1 · Answer 1 · 2016-05-27

7.9 years ago

Devon Ryan 104k

I'm not sure if the next-to-last comment you referred to was from me or Martombo, but the reply is the same in either case. If you have a couple of batches that only have one sample each, then just remove those samples, since they're really not adding anything.

Regarding your second question, yes, that's how your design would look.
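As a rough illustration of the design confirmed above, the following base R sketch builds a made-up sample sheet and inspects the model matrix that `~batch1 + batch2 + condition` implies. The sample sheet, the level names and the `counts` object mentioned in the closing comment are assumptions for the example, not the poster's data.

```r
# Hypothetical sample sheet: 12 samples, 2 conditions, 2 batch variables.
coldata <- data.frame(
  condition = factor(rep(c("control", "treated"), each = 6)),
  batch1    = factor(rep(c("a", "b", "c"), times = 4)),
  batch2    = factor(rep(c("x", "x", "x", "y", "y", "y"), times = 2))
)

# Batch terms first, variable of interest last.
design <- ~ batch1 + batch2 + condition

# The model matrix shows which coefficients would be fitted.
mm <- model.matrix(design, data = coldata)
print(colnames(mm))

# A design can only be fitted if its model matrix has full column rank.
full_rank <- qr(mm)$rank == ncol(mm)
print(full_rank)

# With DESeq2 installed and a count matrix `counts` (hypothetical here),
# the same formula would be passed on as:
# dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
#                                       colData   = coldata,
#                                       design    = design)
```

Putting `condition` last is mostly a convenience: when `results()` is called without further arguments, DESeq2 reports the comparison for the last variable in the design formula.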


Hi Devon, thanks for the quick reply. By the way, I was referring to your comment. I also found the answer to question two in the DESeq2 manual, sorry about that. Regarding the number of samples in a batch: can you clarify what you mean by removing? Removing them from the whole analysis? But then I will lose important information, won't I? In my experiment I compare two conditions with 6 biological replicates in each group. My idea was to leave the related batch effect out of the design.

7.9 years ago by Tobias.Wohland ▴ 70

Yes, remove it from the whole analysis. A sample that is alone in its batch can only be used to estimate that batch's effect, which you don't care about, so it contributes nothing to the comparison between conditions. If you don't include the batch in the model, then certainly go ahead and include everything. Just have a look at some clustering to ensure that the batches don't have an appreciable effect.
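A minimal base R sketch of the advice above, assuming a hypothetical batch variable `batch3` in which two batches (`r` and `s`) hold a single sample each. It counts samples per batch, shows that each singleton batch gets an indicator column covering just one sample, and drops those samples before the batch term would be used. All names and values are invented for illustration.

```r
# Hypothetical sample sheet: batches "r" and "s" contain one sample each.
coldata <- data.frame(
  sample    = paste0("s", 1:12),
  condition = factor(rep(c("control", "treated"), each = 6)),
  batch3    = factor(c("p", "p", "p", "q", "q", "r",
                       "p", "p", "q", "q", "q", "s"))
)

# Count samples per batch and pick out the singleton batches.
batch_sizes <- table(coldata$batch3)
singleton_batches <- names(batch_sizes)[batch_sizes == 1]
print(singleton_batches)

# In the model matrix, a singleton batch column is non-zero for exactly one
# sample, so its coefficient is estimated from that sample alone.
mm <- model.matrix(~ batch3 + condition, data = coldata)
print(colSums(mm[, paste0("batch3", singleton_batches), drop = FALSE]))

# Drop those samples and the now-empty factor levels before modelling batch3.
keep <- !(coldata$batch3 %in% singleton_batches)
coldata_kept <- droplevels(coldata[keep, ])
print(table(coldata_kept$batch3))

# If batch3 is left out of the design instead, all samples can stay; with
# DESeq2 installed and a fitted object `dds` (hypothetical here), a PCA plot
# is one way to check for a visible batch effect:
# DESeq2::plotPCA(DESeq2::vst(dds), intgroup = c("condition", "batch3"))
```

The same subsetting would have to be applied to the columns of the count matrix so that it stays aligned with the sample sheet.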

7.9 years ago by Devon Ryan 104k