Question

DESeq2: how to change the design matrix when samples are exclusive: “Model matrix not full rank” error in checkFullRank(full)

4 months ago

totoroGirl • 0

Hi,

I have a transcriptomics dataset from 4 different clones, i.e. 1_1, 2_1, 3_11 and 4_9. The first two (1_1 and 2_1) are wild type (WT) and the last two (3_11 and 4_9) are KO.

So 1_1 and 2_1 are biological replicates, as are 3_11 and 4_9.

All of these samples were then measured at 3 different time points: t1, t2 and t3.

I know from an initial PCA that there is a strong clone effect on expression, which is not what I am interested in; instead, I want to find the differences due to genotype (KO vs WT).

I have already tried the suggestion in the vignette's “Model matrix not full rank” section, i.e. adding a number that is identical within each nested sample.

The coldata/metadata is as follows:

sample  grp  cnd   cnd  ind.n
S01     WT   1_1   t1   1
S05     WT   1_1   t2   1
S09     WT   1_1   t3   1
S02     WT   2_1   t1   2
S06     WT   2_1   t2   2
S10     WT   2_1   t3   2
S03     KO   3_11  t1   3
S07     KO   3_11  t2   3
S11     KO   3_11  t3   3
S04     KO   4_9   t1   4
S08     KO   4_9   t2   4
S12     KO   4_9   t3   4

That way I can account for the effect of time and also the effect of technical replicates across time. The formula is ~ grp + ind + grp:ind + ind:cnd

mm<-model.matrix(~ grp + grp:ind.n + grp:cnd, coldata)

dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~1 )

dds <- DESeq(dds, full = mm)

However, I still get the same error. I cannot have a matching clone/cnd for each grp/genotype, since it is biologically impossible for a cell to be both WT and KO.

Any idea what to do?


ATpoint · Accepted Answer · 2025-06-26

This is an example of what the edgeR manual refers to as a "within and between" design (comparisons both between and within subjects). See page 44 of the edgeR user's guide here: https://www.bioconductor.org/packages/devel/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf

You can do exactly the same in DESeq2. See the discussion in the DESeq2 manual here: https://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#group-specific-condition-effects-individuals-nested-within-groups
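
As a rough sketch of how that vignette section maps onto the sample sheet in the question (not run on the real counts): the vignette numbers individuals so that the numbering restarts within each group, whereas the posted ind.n runs from 1 to 4 across both groups. Assuming ind.n is a factor, that numbering alone is enough to make the matrix rank deficient. The names clone, ind.global, mm_bad and mm_good are made up for illustration; clone stands for the first (clone ID) cnd column and cnd for the time point.

```r
# Sample sheet rebuilt from the question (S01..S12, 4 clones x 3 time points).
coldata <- data.frame(
  sample = sprintf("S%02d", 1:12),
  grp    = factor(rep(c("WT", "WT", "KO", "KO"), times = 3),
                  levels = c("WT", "KO")),
  clone  = factor(rep(c("1_1", "2_1", "3_11", "4_9"), times = 3)),
  cnd    = factor(rep(c("t1", "t2", "t3"), each = 4))
)

# Numbering as posted: clones are numbered 1-4 across both genotypes.
coldata$ind.global <- factor(rep(1:4, times = 3))
# Numbering as in the vignette: restart at 1 within each genotype.
coldata$ind.n <- factor(rep(c(1, 2, 1, 2), times = 3))

mm_bad  <- model.matrix(~ grp + grp:ind.global + grp:cnd, coldata)
mm_good <- model.matrix(~ grp + grp:ind.n + grp:cnd, coldata)

# Compare numerical rank with the number of columns.
qr(mm_bad)$rank  < ncol(mm_bad)    # TRUE: rank deficient (all-zero columns)
qr(mm_good)$rank == ncol(mm_good)  # TRUE: full rank

# With the real count matrix `cts` from the question, the fit would then be
# along these lines (coefficient names should be checked with resultsNames(dds)):
# dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
#                               design = ~ grp + grp:ind.n + grp:cnd)
# dds <- DESeq(dds)
# results(dds, contrast = list("grpKO.cndt2", "grpWT.cndt2"))
```

Note that this design targets the time effects within each genotype and how they differ between genotypes. The baseline KO vs WT difference is not separable from clone-to-clone differences once clone terms are fitted as fixed effects.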

However, it would be even better to treat the patient effects (here, the clone effects) as random effects in a mixed-effects model. You can achieve this to some extent by using limma::voom together with limma's duplicateCorrelation() function.
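
A minimal sketch of the voom plus duplicateCorrelation() route, with the clone as the blocking variable. The counts are simulated only so that the code runs end to end; clone, time, group and the contrast name KOvsWT are placeholder names of mine, and averaging over the three time points is just one possible way to define the genotype effect.

```r
library(limma)

# Sample sheet rebuilt from the question.
coldata <- data.frame(
  sample = sprintf("S%02d", 1:12),
  grp    = factor(rep(c("WT", "WT", "KO", "KO"), times = 3),
                  levels = c("WT", "KO")),
  clone  = factor(rep(c("1_1", "2_1", "3_11", "4_9"), times = 3)),
  time   = factor(rep(c("t1", "t2", "t3"), each = 4))
)

# Simulated counts (assumption: stands in for the real matrix `cts`).
set.seed(1)
cts <- matrix(rnbinom(500 * 12, mu = 100, size = 5), nrow = 500,
              dimnames = list(paste0("gene", 1:500),
                              as.character(coldata$sample)))

# One coefficient per genotype/time combination; clone is NOT in the design.
group  <- factor(paste(coldata$grp, coldata$time, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Estimate the within-clone correlation, then rerun voom using it.
v      <- voom(cts, design)
corfit <- duplicateCorrelation(v, design, block = coldata$clone)
v      <- voom(cts, design, block = coldata$clone,
               correlation = corfit$consensus.correlation)

# Fit with clone as a random (blocking) effect.
fit <- lmFit(v, design, block = coldata$clone,
             correlation = corfit$consensus.correlation)

# KO vs WT, averaged over the three time points.
cm <- makeContrasts(
  KOvsWT = (KO.t1 + KO.t2 + KO.t3) / 3 - (WT.t1 + WT.t2 + WT.t3) / 3,
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, cm))
tab  <- topTable(fit2, coef = "KOvsWT", number = Inf)
head(tab)
```

Because clone is handled through the correlation structure rather than as a fixed term, the genotype comparison stays estimable, although with only two clones per genotype the power for a between-clone comparison will be limited.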