Question: Error in differential analysis for samples with different time points
Biologist wrote (5 months ago):

I have 4 samples: 2 control and 2 Gene_OE (overexpression) samples.

I want to run a differential expression analysis between the Gene_OE and Control samples. My sample column data looks like this:

coldata:

Samples Type    Time
SampleA Control Day1
SampleB Control Day2
SampleD Gene_OE Day1
SampleE Gene_OE Day2

Using edgeR, I did the following:

library(edgeR)
group <- factor(paste0(coldata$Type))

Then I created the design matrix as follows:

design2 <- model.matrix(~ 0 + group + coldata$Time)
design2

    Control Gene_OE day1 day2
1       1        0    0    0
2       1        0    1    0
3       0        1    0    0
4       0        1    1    0

I see a warning message:

y <- estimateDisp(y, design2, robust=TRUE)
Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group,  :
  No residual df: setting dispersion to NA

And an error like the one below:

fit <- glmQLFit(y, design2, robust=TRUE)
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset,  : 
  Design matrix not of full rank.  The following coefficients not estimable:
 day2

What could be the reason for this error, and how can I resolve it?

h.mon wrote (5 months ago):

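The design matrix is not of full rank: as the error message says, the day2 coefficient is not estimable, and in the matrix you posted that column is all zeros. With four samples and four columns there are also no residual degrees of freedom, hence the estimateDisp warning. More generally, you have only one sample (no biological replicates) per combination of treatment (type + time). You may either drop day from the analysis or add more samples per treatment.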
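
A quick way to see which coefficients cannot be estimated is to compare the rank of the design matrix with its number of columns. The sketch below uses base R only and rebuilds the four-sample table from the question. It rests on an assumption the post does not confirm: that the all-zero column comes from an unused level left in the Time factor (here called "Day3"). The names design_bad, design_ok and coldata_ok are made up for illustration.

```r
# Rebuild the four-sample table from the question.
# ASSUMPTION: Time carries an unused level ("Day3"), e.g. left over from
# subsetting a larger table. The original post does not confirm this.
coldata <- data.frame(
  Samples = c("SampleA", "SampleB", "SampleD", "SampleE"),
  Type    = c("Control", "Control", "Gene_OE", "Gene_OE"),
  Time    = factor(c("Day1", "Day2", "Day1", "Day2"),
                   levels = c("Day1", "Day2", "Day3"))
)
group <- factor(coldata$Type)

# model.matrix() keeps a column for every factor level, used or not.
design_bad <- model.matrix(~ 0 + group + Time, data = coldata)
ncol(design_bad)          # 4 columns
qr(design_bad)$rank       # rank 3, so one coefficient is not estimable
colSums(design_bad != 0)  # the TimeDay3 column has no non-zero entries

# Dropping unused levels removes the empty column.
coldata_ok <- droplevels(coldata)
design_ok  <- model.matrix(~ 0 + group + Time, data = coldata_ok)
ncol(design_ok)                    # 3 columns
qr(design_ok)$rank                 # rank 3: full rank
nrow(design_ok) - ncol(design_ok)  # 1 residual degree of freedom
```

If this is what happened, the cleaned-up design can be fitted, but a single residual degree of freedom gives very unstable dispersion estimates, so the advice to add replicates still applies.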


But if my coldata looks like the one below, I don't see any error:

coldata:

Samples Type    Time
SampleA Control Day1
SampleB Control Day2
SampleC Control Day3
SampleD Gene_OE Day1
SampleE Gene_OE Day2
SampleF Gene_OE Day3

design2 <- model.matrix(~ 0 + group + coldata$Time)
design2

    Control Gene_OE Day2 Day3
1       1        0    0    0
2       1        0    1    0
3       1        0    0    1
4       0        1    0    0
5       0        1    1    0
6       0        1    0    1

Is this right?

Reply by Biologist, 5 months ago

The above has the very same problem as your original post: each type has only one sample per day. You need more replicates per treatment, that is, more than one sample of each type on the same day.
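
To make the replicate count concrete, here is a small base R sketch that rebuilds the six-sample table from the reply above and counts the degrees of freedom left over for estimating variability. The helper resid_df and the names design_add and design_int are introduced here for illustration only; nothing in it calls edgeR.

```r
# Six-sample layout from the reply: one sample per type and day.
coldata <- data.frame(
  Type = factor(rep(c("Control", "Gene_OE"), each = 3)),
  Time = factor(rep(c("Day1", "Day2", "Day3"), times = 2))
)

# Residual degrees of freedom: samples minus estimable coefficients.
resid_df <- function(design) nrow(design) - qr(design)$rank

# Additive model: type effect plus day effect, no interaction.
design_add <- model.matrix(~ 0 + Type + Time, data = coldata)
resid_df(design_add)  # 6 samples - 4 coefficients = 2

# Interaction model: the type effect may differ from day to day.
design_int <- model.matrix(~ Type * Time, data = coldata)
resid_df(design_int)  # 6 samples - 6 coefficients = 0
```

The additive model runs without error because it assumes the Gene_OE effect is the same on every day, which frees up two degrees of freedom. As soon as the effect is allowed to vary by day, every sample is used up by a coefficient, which is the sense in which each type and day combination has no replicate.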

Reply by h.mon, 5 months ago

Could you please tell me whether the approach above is right, or whether I should add more samples per treatment?

Reply by Biologist, 5 months ago