Error in differential analysis for samples with different time points
2.7 years ago
Biologist ▴ 230

I have 4 samples: 2 control and 2 Gene_OE (overexpression) samples.

I want to run a differential analysis of Gene_OE vs Control samples. My sample column data looks like this:

coldata:

Samples Type    Time
SampleA Control Day1
SampleB Control Day2
SampleD Gene_OE Day1
SampleE Gene_OE Day2


Using edgeR, I did the following:

library(edgeR)
group <- factor(paste0(coldata$Type))

And created the design matrix as follows:

design2 <- model.matrix(~ 0 + group + coldata$Time)
design2

  Control Gene_OE day1 day2
1       1       0    0    0
2       1       0    1    0
3       0       1    0    0
4       0       1    1    0
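A minimal sketch, in base R only, for inspecting the design matrix exactly as it is printed above. The values are typed in by hand from the post rather than rebuilt from coldata, and the names `design_rank`, `n_coef`, `zero_cols` and `resid_df` are illustrative. It compares the rank of the matrix with its number of columns and lists any column that is zero for every sample.

```r
# Design matrix entered exactly as printed in the post (column names kept as shown).
design2 <- matrix(
  c(1, 0, 0, 0,
    1, 0, 1, 0,
    0, 1, 0, 0,
    0, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(as.character(1:4), c("Control", "Gene_OE", "day1", "day2"))
)

# Rank of the matrix versus its number of columns (coefficients).
design_rank <- qr(design2)$rank
n_coef <- ncol(design2)

# Columns that are zero for every sample carry no information.
zero_cols <- colnames(design2)[colSums(abs(design2)) == 0]

# Residual degrees of freedom if one coefficient is fitted per column.
resid_df <- nrow(design2) - n_coef

print(c(rank = design_rank, columns = n_coef, residual_df = resid_df))
print(zero_cols)
```

For the matrix as printed, the rank is lower than the number of columns and the day2 column is all zeros, so that coefficient cannot be estimated, and four columns for four samples leave no residual degrees of freedom. If the all-zero column comes from a factor level that has no samples (an assumption; the post does not show how Time is stored), calling `droplevels()` on the column data before `model.matrix()` would remove it.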


I see a warning message:

y <- estimateDisp(y, design2, robust=TRUE)
Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group, :
  No residual df: setting dispersion to NA

And an error like the one below:

fit <- glmQLFit(y, design2, robust=TRUE)
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
  Design matrix not of full rank. The following coefficients not estimable: day2

What could be the reason for this error, and how can I resolve it?

RNA-Seq r edger differential analysis

2.7 years ago
h.mon 33k

The design matrix is not full rank because you have only one sample (no biological replicates) per combination of treatment (type+time). You may either drop day from the analysis, or add more samples per treatment.

But if my coldata looks like the one below, I don't see any error:

coldata:

Samples Type    Time
SampleA Control Day1
SampleB Control Day2
SampleC Control Day3
SampleD Gene_OE Day1
SampleE Gene_OE Day2
SampleF Gene_OE Day3

design2 <- model.matrix(~ 0 + group + coldata$Time)
design2

  Control Gene_OE Day2 Day3
1       1       0    0    0
2       1       0    1    0
3       1       0    0    1
4       0       1    0    0
5       0       1    1    0
6       0       1    0    1


Is this right?
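A short base-R sketch of how the six-sample layout above can be examined, assuming the table shown is the complete column data. The names `time`, `design_add`, `design_cell` and `resid_df` are introduced here for illustration. It counts the residual degrees of freedom for the additive design used in the post and, for comparison, for a design with a separate mean for every type/day combination.

```r
# Column data typed in from the six-sample table in the post.
coldata <- data.frame(
  Samples = paste0("Sample", LETTERS[1:6]),
  Type    = rep(c("Control", "Gene_OE"), each = 3),
  Time    = rep(c("Day1", "Day2", "Day3"), times = 2)
)
group <- factor(coldata$Type)
time  <- factor(coldata$Time)

# Additive model, as in the post: one coefficient per type plus Day2 and Day3.
design_add <- model.matrix(~ 0 + group + time)

# Comparison: a separate mean for every type-by-time combination.
design_cell <- model.matrix(~ 0 + group:time)

# Residual degrees of freedom = number of samples minus rank of the design.
resid_df <- function(d) nrow(d) - qr(d)$rank

print(resid_df(design_add))
print(resid_df(design_cell))
```

Under these assumptions the additive design has full rank and leaves 2 residual degrees of freedom, which is consistent with no error being raised. The per-combination design leaves none, because each type/day combination has a single sample, so any variability estimate here rests entirely on the additive assumption rather than on true replicates.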


The above has the very same problem as your original post. You need more replicates per treatment: each type/day combination has only one sample, and you need more than one sample per type on the same day.
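As a rough illustration of what "more replicates per treatment" could look like, here is a base-R sketch with a hypothetical layout. The sample names, the choice of two replicates per combination, and the names `coldata_rep`, `cell_counts`, `design_rep` and `resid_df_rep` are all made up for this example and do not come from the thread.

```r
# Hypothetical layout with two replicates per type/day combination.
# Sample names and the number of replicates are invented for illustration.
coldata_rep <- data.frame(
  Samples = paste0("S", 1:8),
  Type    = rep(c("Control", "Gene_OE"), each = 4),
  Time    = rep(rep(c("Day1", "Day2"), each = 2), times = 2)
)

# Number of samples in each type/day combination.
cell_counts <- table(coldata_rep$Type, coldata_rep$Time)
print(cell_counts)

group <- factor(coldata_rep$Type)
time  <- factor(coldata_rep$Time)
design_rep <- model.matrix(~ 0 + group + time)

# Residual degrees of freedom available for estimating dispersion.
resid_df_rep <- nrow(design_rep) - qr(design_rep)$rank
print(resid_df_rep)
```

A cross-tabulation like `cell_counts` is a quick check to run on real column data before building the design: any cell with a count of 1 is a combination without a biological replicate.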


Could you please tell me whether the approach above is right, or whether I should add more samples per treatment?