Question: How to account for batch effects in a design matrix?
17 months ago by
star190
Netherlands
star190 wrote:

I have two data sets: 4 samples with 2 replicates each from batch 1, and another 4 samples with 2 replicates each from batch 2.

I would like to remove batch effects from these samples and compare the different methods with each other. I ran the commands below but got an error:

``````r
design

                                samples method batch
L4_rep1                              L4      L    b1
L4_rep2                              L4      L    b1
L6_L8_rep1                        L6_L8      L    b1
L6_L8_rep2                        L6_L8      L    b1
Q5_Q7_rep1                        Q5_Q7      Q    b1
Q5_Q7_rep2                        Q5_Q7      Q    b1
Q3_rep1                              Q3      Q    b1
Q3_rep2                              Q3      Q    b1
co_40d_A                         co_40d co_40d    b2
co_40d_B                         co_40d co_40d    b2
co_60d_A                         co_60d co_60d    b2
co_60d_B                         co_60d co_60d    b2
EB_A                                 EB     EB    b2
EB_B                                 EB     EB    b2
H9_A                                 H9     H9    b2
H9_B                                 H9     H9    b2

design$samples <- factor(design$samples, levels = c("L4", "L6_L8", "Q3", "Q5_Q7", "co_40d", "co_60d", "EB", "H9"))
design$method <- factor(design$method, levels = c("L", "Q", "co_40d", "co_60d", "EB", "H9"))
design$batch <- factor(design$batch, levels = c("b1", "b2"))

design.matrix <- model.matrix(~0+batch+method,design)

design.matrix

           batchb1 batchb2 methodQ methodco_40d methodco_60d methodEB methodH9
L4_rep1          1       0       0            0            0        0        0
L4_rep2          1       0       0            0            0        0        0
L6_L8_rep1       1       0       0            0            0        0        0
L6_L8_rep2       1       0       0            0            0        0        0
Q5_Q7_rep1       1       0       1            0            0        0        0
Q5_Q7_rep2       1       0       1            0            0        0        0
Q3_rep1          1       0       1            0            0        0        0
Q3_rep2          1       0       1            0            0        0        0
co_40d_A         0       1       0            1            0        0        0
co_40d_B         0       1       0            1            0        0        0
co_60d_A         0       1       0            0            1        0        0
co_60d_B         0       1       0            0            1        0        0
EB_A             0       1       0            0            0        1        0
EB_B             0       1       0            0            0        1        0
H9_A             0       1       0            0            0        0        1
H9_B             0       1       0            0            0        0        1

library(edgeR)
data_filter <- count_table  # count_table: the filtered count table (placeholder name)
edgeR.dgelist = DGEList(data_filter)
edgeR.dgelist_normal = calcNormFactors(edgeR.dgelist)
CommonDisp <- estimateGLMCommonDisp(edgeR.dgelist_normal, design.matrix)
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset,  : Design matrix not of full rank.  The following coefficients not estimable: methodH9
``````

Is my design matrix correct? I would also like to compare the Q method against the other methods. Could you please help me set up the contrasts?

17 months ago by
h.mon27k
Brazil
h.mon27k wrote:

Your method and batch are confounded (b1 -> L, Q; b2 -> co_40d, co_60d, EB, H9), and the same is true for sample and batch. Also, within b2, method and sample are identical, and it is not possible to estimate effects for redundant variables. Are the samples technical replicates? You could drop samples and keep only method, but batch would still be confounded with method, meaning you can't estimate batch and method effects independently.
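
To make the confounding concrete, here is a minimal sketch in base R that rebuilds the annotation from the question and checks the rank of the design matrix. The object names (`full`, `reduced`, `q_vs_l`) are my own, and the Q versus L contrast is only an illustration of one comparison that stays within a single batch, not a recommendation for the whole analysis.

```r
# Rebuild the sample annotation from the question (16 libraries, 2 batches).
design <- data.frame(
  method = factor(rep(c("L", "Q", "co_40d", "co_60d", "EB", "H9"),
                      times = c(4, 4, 2, 2, 2, 2)),
                  levels = c("L", "Q", "co_40d", "co_60d", "EB", "H9")),
  batch  = factor(rep(c("b1", "b2"), each = 8), levels = c("b1", "b2"))
)

# Every method occurs in exactly one batch, so method is nested in batch.
print(table(design$method, design$batch))

# Design from the question: 7 columns, but batchb2 equals the sum of the
# four batch-2 method columns, so the columns are linearly dependent.
full <- model.matrix(~ 0 + batch + method, data = design)
print(c(columns = ncol(full), rank = qr(full)$rank))

# Dropping batch leaves one column per method, which is full rank.
reduced <- model.matrix(~ 0 + method, data = design)
colnames(reduced) <- levels(design$method)
print(qr(reduced)$rank == ncol(reduced))

# Illustrative contrast (assumption): Q versus L. Both methods are in b1,
# so this comparison is not affected by the batch difference.
q_vs_l <- c(L = -1, Q = 1, co_40d = 0, co_60d = 0, EB = 0, H9 = 0)
```

With a fit on the reduced design, a vector like `q_vs_l` could be passed as the `contrast` argument of edgeR's `glmLRT`. Any comparison of Q against the b2 methods would still mix the method difference with the batch difference, and no design matrix can separate them with this layout.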

Try searching for `Design matrix not of full rank`; there are plenty of very good posts explaining the causes and how to solve it. For example:

https://support.bioconductor.org/p/68092/

Design matrix not of full rank.

https://support.bioconductor.org/p/80408/