Constructing design matrix for batch effect, without intercept.
4.3 years ago
carl.h ▴ 20

I am trying to figure out how to construct a design matrix for a multivariate experiment.

I am using edgeR to analyse the gene expression data of 32 different samples. There are 16 gene knock-outs (KO) and 16 wildtype controls (WT). The experiment has two time points (1h and 24h) and two treatments (TGF-beta and control). The research question is which genes are differentially expressed (DE) between the KO and the WT under the different conditions, so I will compare KO with WT separately for each combination of treatment and time point. To do this, I first constructed a design matrix without an intercept, since I didn't have any reasonable baseline to compare against. For example, I wanted to look at the expression at 24 hours either with or without TGF-beta. The design matrix:

my.design <- model.matrix(~ 0  + design, data=matrix)

   designKO1hControl designKO1hTGF designKO24hControl designKO24hTGF designWT1hControl designWT1hTGF designWT24hControl designWT24hTGF
1                  0             0                  0              0                 0             1                  0              0
2                  0             0                  0              0                 0             0                  0              1
3                  0             0                  0              0                 1             0                  0              0
4                  0             0                  0              0                 0             0                  1              0
5                  0             1                  0              0                 0             0                  0              0
6                  0             0                  0              1                 0             0                  0              0
7                  1             0                  0              0                 0             0                  0              0
8                  0             0                  1              0                 0             0                  0              0
9                  0             0                  0              0                 0             1                  0              0
10                 0             0                  0              0                 0             0                  0              1
11                 0             0                  0              0                 1             0                  0              0
12                 0             0                  0              0                 0             0                  1              0
13                 0             1                  0              0                 0             0                  0              0
14                 0             0                  0              1                 0             0                  0              0
15                 1             0                  0              0                 0             0                  0              0
16                 0             0                  1              0                 0             0                  0              0
17                 0             0                  0              0                 0             1                  0              0
18                 0             0                  0              0                 0             0                  0              1
19                 0             0                  0              0                 1             0                  0              0
20                 0             0                  0              0                 0             0                  1              0
21                 0             1                  0              0                 0             0                  0              0
22                 0             0                  0              1                 0             0                  0              0
23                 1             0                  0              0                 0             0                  0              0
24                 0             0                  1              0                 0             0                  0              0
25                 0             0                  0              0                 0             1                  0              0
26                 0             0                  0              0                 0             0                  0              1
27                 0             0                  0              0                 1             0                  0              0
28                 0             0                  0              0                 0             0                  1              0
29                 0             1                  0              0                 0             0                  0              0
30                 0             0                  0              1                 0             0                  0              0
31                 1             0                  0              0                 0             0                  0              0
32                 0             0                  1              0                 0             0                  0              0


So the matrix has one column per condition, which lets me compare any two conditions directly.
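As a minimal sketch of how such comparisons could be written down, the code below rebuilds the design from a hypothetical sample sheet and defines one KO-minus-WT contrast per time point and treatment. The data frame `targets`, the vector `combos` and the matrix `my.contrasts` are names made up for illustration; the sample order is read off the matrix printed above.

```r
# Hypothetical sample sheet: the 8 conditions in the order seen in the
# printed design matrix, repeated 4 times (32 samples in total).
conditions <- c("WT1hTGF", "WT24hTGF", "WT1hControl", "WT24hControl",
                "KO1hTGF", "KO24hTGF", "KO1hControl", "KO24hControl")
targets <- data.frame(design = factor(rep(conditions, times = 4)))

# No intercept: every column is the mean of one condition.
my.design <- model.matrix(~ 0 + design, data = targets)

# One contrast vector per time/treatment combination: +1 for KO, -1 for WT.
combos <- c("1hControl", "1hTGF", "24hControl", "24hTGF")
my.contrasts <- sapply(combos, function(cc) {
  v <- setNames(numeric(ncol(my.design)), colnames(my.design))
  v[paste0("designKO", cc)] <- 1
  v[paste0("designWT", cc)] <- -1
  v
})
my.contrasts
```

Each column of `my.contrasts` could then be passed as the `contrast` argument of edgeR's `glmQLFTest()` (or `glmLRT()`) after fitting the model with this design.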

My problem is that I have 4 biological replicates each of the KO and the WT. Each was knocked out individually, and the MDS plot shows that the replicates group together. I would like to do as in chapter 3.4.3 of the edgeR handbook, where the batch effect between such replicates is removed. So I constructed a similar design matrix:

my.design <- model.matrix(~ batcheffect  + design, data=matrix)


This, as I understand it, would use one of the factor levels as the intercept, so that all other samples are compared to it, which I would like to avoid. It would be of great help if someone could help me solve this. If you have a better suggestion for organizing the design matrix, that would be of great help too.
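One possible way to keep the per-condition columns while still adjusting for batch is to drop the intercept and put the condition factor first in the formula. The sketch below assumes that `batcheffect` is a 4-level factor crossed with the conditions (each batch contains all 8 conditions, here taken to be the 4 blocks of 8 rows in the matrix above); the data frame `targets` and the contrast name are hypothetical. If the replicates are instead nested within KO or WT, this additive model may not be appropriate.

```r
# Hypothetical sample sheet: 8 conditions repeated in 4 blocks of 8 samples;
# each block is ASSUMED to be one batch (replicate).
conditions <- c("WT1hTGF", "WT24hTGF", "WT1hControl", "WT24hControl",
                "KO1hTGF", "KO24hTGF", "KO1hControl", "KO24hControl")
targets <- data.frame(
  design      = factor(rep(conditions, times = 4)),
  batcheffect = factor(rep(1:4, each = 8))
)

# With "~ 0 + design + batcheffect" the first factor keeps one column per
# condition and the batch factor adds 3 additive columns (batches 2 to 4).
my.design <- model.matrix(~ 0 + design + batcheffect, data = targets)
colnames(my.design)

# The design must be of full column rank for the coefficients to be estimable.
qr(my.design)$rank == ncol(my.design)

# Example contrast: KO versus WT at 24h with TGF-beta. The batch columns get
# weight 0, so the choice of reference batch cancels out of the comparison.
ko.vs.wt.24h.tgf <- setNames(numeric(ncol(my.design)), colnames(my.design))
ko.vs.wt.24h.tgf["designKO24hTGF"] <- 1
ko.vs.wt.24h.tgf["designWT24hTGF"] <- -1

# With edgeR this vector would be used roughly as follows (y is a DGEList):
#   y   <- estimateDisp(y, my.design)
#   fit <- glmQLFit(y, my.design)
#   res <- glmQLFTest(fit, contrast = ko.vs.wt.24h.tgf)
```

In this parameterisation the condition columns are the condition means within the first batch, but differences between conditions do not depend on which batch serves as the reference.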

RNA-Seq edgeR design matrix