Question

EdgeR-GLMs correcting for batch effects

6

Entering edit mode

9.9 years ago

Matina ▴ 250

Hi all,

I have RNA-seq data for 3 conditions (control, breast, endometrial). Some of the samples where collected and sequenced in two different batches. I want to correct for batch effects using EdgeR and GLMs. I did read the EdgeR vignette, and specifically section 4.5 but I'm still a bit confused with the contrasts matrix and if I'm interpreting the results correctly.

This is my code:

design = model.matrix(~group+batch, data=d$samples)
d$samples$group = relevel(d$samples$group, ref="control")
rownames(design) = colnames(d)

d <- estimateGLMCommonDisp(d, design, verbose=TRUE)
d <- estimateGLMTrendedDisp(d, design)
d <- estimateGLMTagwiseDisp(d, design)

fit = glmFit(d, design)
lrt = glmLRT(fit, contrast=c(0,0,-1,0))

This is my design matrix

            intercept   groupbrc    groupendo   batch2
sample.1    1           0           1           0
sample.2    1           0           1           0
sample.3    1           0           1           0
sample.4    1           0           0           0
sample.5    1           1           0           0
sample.6    1           1           0           0
sample.7    1           1           0           1
sample.8    1           1           0           1

Basically what i want to do is pairwise comparisons between the treatments (brc vs normal, endo vs normal, brc vs endo) and accounting for batch effects at the same time. I understand that the (intercept) corresponds to the normal condition but what I don't understand is what the last column (batch 2) means, and if i should include it in my contrasts. The contrasts I've used are the following, i get DE genes but I'm not sure if I'm accounting for batch effects correctly

Brc VS normal lrt = glmLRT(fit, contrast=c(0,1,0,0))

Endo VS Normal lrt = glmLRT(fit, contrast=c(0,0,1,0))

Brc VS Endo lrt = glmLRT(fit, contrast=c(0,1,-1,0))

Thank you!

edgeR RNA-Seq Batch-Effects • 6.9k views

ADD COMMENT • link updated 2.6 years ago by Ram 43k • written 9.9 years ago by Matina ▴ 250

Ram · Accepted Answer · 2014-05-30

3

Entering edit mode

9.9 years ago

Devon Ryan 104k

Brc vs. normal: lrt = glmLRT(fit, coef=2)
Endo vs. normal: lrt = glmLRT(fit, coef=3)

I would do Brc vs. Endo the same as you did, since you do want to contrast two model coefficients. There's no need to use a contrast when you're just wanting to test whether a model coefficient is itself significant.

ADD COMMENT • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Devon Ryan 104k

0

Entering edit mode

Thank you for your reply.

Is there a difference between

Brc vs. normal: lrt = glmLRT(fit, coef=2)
Endo vs. normal: lrt = glmLRT(fit, coef=3)

and

Brc VS normal: lrt = glmLRT(fit, contrast=c(0,1,0,0))
Endo VS Normal: lrt = glmLRT(fit, contrast=c(0,0,1,0))

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Matina ▴ 250

0

Entering edit mode

They're the same, it's just easier to follow when dealing directly with coefficients since you're not actually trying to contrast any of them.

ADD REPLY • link 9.9 years ago by Devon Ryan 104k