**160**wrote:

Hi all,

I have RNA-seq data for 3 conditions (control, breast, endometrial). Some of the samples where collected and sequenced in two different batches. I want to correct for batch effects using EdgeR and GLMs. I did read the EdgeR vignette, and specifically section 4.5 but I'm still a bit confused with the contrasts matrix and if i'm interpreting the results correctly.

This is my code:

`design = model.matrix(~group+batch, data=d$samples)`

d$samples$group = relevel(d$samples$group, ref="control")

rownames(design) = colnames(d)

`d <- estimateGLMCommonDisp(d, design, verbose=TRUE)`

d <- estimateGLMTrendedDisp(d, design)

d <- estimateGLMTagwiseDisp(d, design)

`fit = glmFit(d, design)`

lrt = glmLRT(fit, contrast=c(0,0,-1,0))

This is my design matrix

intercept | groupbrc | groupendo | batch2 | |

sample.1 | 1 | 0 | 1 | 0 |

sample.2 | 1 | 0 | 1 | 0 |

sample.3 | 1 | 0 | 1 | 0 |

sample.4 | 1 | 0 | 0 | 0 |

sample.5 | 1 | 1 | 0 | 0 |

sample.6 | 1 | 1 | 0 | 0 |

sample.7 | 1 | 1 | 0 | 1 |

sample.8 | 1 | 1 | 0 | 1 |

Basically what i want to do is pairwise comparisons between the treatments (brc vs normal, endo vs normal, brc vs endo) and accounting for batch effects at the same time. I understand that the (intercept) corresponds to the normal condition but what i don't understand is what the last column (batch 2) means, and if i should include it in my contrasts. The contrasts I've used are the following, i get DE genes but I'm not sure if I'm accounting for batch effects correctly

Brc VS normal `lrt = glmLRT(fit, contrast=c(0,1,0,0))`

Endo VS Normal` `

`lrt = glmLRT(fit, contrast=c(0,0,1,0))`

Brc VS Endo `lrt = glmLRT(fit, contrast=c(0,1,-1,0))`

Thank you!

**88k**• written 4.8 years ago by Matina •

**160**