Hi,
I have a transcriptomics dataset from 4 different clones: 1_1, 2_1, 3_11 and 4_9. The first two (1_1 and 2_1) are wild type (WT) and the last two (3_11 and 4_9) are knockout (KO).
So 1_1 and 2_1 are biological replicates of each other, as are 3_11 and 4_9.
All of these samples were measured at 3 time points: t1, t2 and t3.
I know from an initial PCA that there is a strong clone effect on expression, which is not what I am interested in. Instead, I want to find the differences due to genotype (KO vs WT).
I have already tried the suggestion from the "Model matrix not full rank" section of the DESeq2 vignette, adding a column (ind.n) that holds the same number for every sample from the same nested clone.
The coldata/metadata is as follows:
sample grp ind cnd ind.n
S01 WT 1_1 t1 1
S05 WT 1_1 t2 1
S09 WT 1_1 t3 1
S02 WT 2_1 t1 2
S06 WT 2_1 t2 2
S10 WT 2_1 t3 2
S03 KO 3_11 t1 3
S07 KO 3_11 t2 3
S11 KO 3_11 t3 3
S04 KO 4_9 t1 4
S08 KO 4_9 t2 4
S12 KO 4_9 t3 4
This way I should be able to account for the effect of time and also for the effect of each technical replicate across time. The formula is ~ grp + ind + grp:ind + ind:cnd
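library(DESeq2)
# Custom model matrix: genotype, clone nested within genotype, genotype-specific time effect
mm <- model.matrix(~ grp + grp:ind.n + grp:cnd, coldata)
# Placeholder design; the actual model is supplied through `full` below
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ 1)
dds <- DESeq(dds, full = mm)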
However, I still get the same error. I cannot have a matching clone/cnd for each grp/genotype, since it is biologically impossible for a cell to be both WT and KO.
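To make the problem easier to inspect, here is a minimal base R sketch that rebuilds the sample table above and checks the rank of the resulting model matrix. It rests on two assumptions that may not match the real coldata: the clone column is called ind, and ind.n is stored as a factor rather than a number.

```r
# Rebuild the sample table from the post.
# Assumptions: clone column named `ind`; `ind.n` treated as a factor.
coldata <- data.frame(
  sample = sprintf("S%02d", c(1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, 12)),
  grp    = factor(rep(c("WT", "KO"), each = 6), levels = c("WT", "KO")),
  ind    = factor(rep(c("1_1", "2_1", "3_11", "4_9"), each = 3)),
  cnd    = factor(rep(c("t1", "t2", "t3"), times = 4)),
  ind.n  = factor(rep(1:4, each = 3))
)

# Same formula as in the model.matrix() call above.
mm <- model.matrix(~ grp + grp:ind.n + grp:cnd, coldata)

# Columns that are zero for every sample correspond to grp/ind.n
# combinations that never occur in the data.
zero_cols <- colnames(mm)[colSums(mm != 0) == 0]
print(zero_cols)

# The matrix is full rank only if its rank equals its number of columns.
rank_mm <- qr(mm)$rank
cat("rank:", rank_mm, "of", ncol(mm), "columns\n")
```

If the reported rank is smaller than the number of columns, the matrix is not full rank, and the all-zero columns listed in zero_cols are a first place to look.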
Any idea what to do?