Question

DESeq2 design: longitudinal two timepoint study with treatment effect

0

Entering edit mode

8 weeks ago

jonstug • 0

Dear all,

I have repeated RNASeq results from a cohort. Everyone has samples at baseline before treatment is started (time = 0). Some are then treated, some arent (sadly not randomised)

I was wanting to assess the impact of treatment on gene expression using DESeq2.

At the moment my columns are set up like:

ID   time   treated
1     0        y  
1     1        y
2     0        n
2     1        n
3     0        y
3     1        y
... [14 paired results in total]

I was therefore wanting to look at difference in differences of paired samples. The DESeq2 design code

design = ~ + ID + time*treated

does not work because it is not full rank becuase 'treated' is the same amongst the same ID.

I have therefore created a rather convoluted extra variable: delta_treated and delta_untreated:

ID   time   treated  delta_treated  delta_untreated
1     0        y          0               0
1     1        y          1               0
2     0        n          0               0
2     1        n          0               1
3     0        y          0               0
3     1        y          1               0
... [14 paired results in total]

This therefore gets rid of the full rank issue, and is basically coding in the interaction I wanted. My design is therefore:

design = ~ ID + delta_treated + delta_untreated
res <- results(dds, contrast = list(c("delta_treated"), c("delta_untreated")))

This all strikes me as a convoluted way around. But I couldn't get it to work by changing the treated to 'n' at time = 0, and running ID + visit + treated, as I couldn't get the difference in differences results out.

Is there a better way to do this?? No let me rephrase: what is the better way to do this?

Additionally, is there any good way for accounting for baseline differences between the groups at all? Thanks!

longitudinal treatment DESeq2 effect • 3.8k views

ADD COMMENT • link updated 8 weeks ago by LChart 5.1k • written 8 weeks ago by jonstug • 0

score 0 · Answer 1 · 2025-09-02

Everyone has samples at baseline before treatment is started (time = 0).

It seems the issue is you've coded entire samples as treated, when the baseline t=0 should be untreated everywhere [if the above statement is true]. You should have:

iid    time    treatment
  1       0            n
  1       1            y
  2       0            n
  2       1            n
  3       0            n
  3       1            y

so that ~ iid + time + treatment will properly isolate the treament effect from sample and timepoint (fixed) effects.