Dear all,
I have repeated RNASeq results from a cohort. Everyone has samples at baseline before treatment is started (time = 0). Some are then treated, some arent (sadly not randomised)
I was wanting to assess the impact of treatment on gene expression using DESeq2.
At the moment my columns are set up like:
ID   time   treated
1     0        y  
1     1        y
2     0        n
2     1        n
3     0        y
3     1        y
... [14 paired results in total]
I was therefore wanting to look at difference in differences of paired samples. The DESeq2 design code
design = ~ + ID + time*treated
does not work because it is not full rank becuase 'treated' is the same amongst the same ID.
I have therefore created a rather convoluted extra variable: delta_treated and delta_untreated:
ID   time   treated  delta_treated  delta_untreated
1     0        y          0               0
1     1        y          1               0
2     0        n          0               0
2     1        n          0               1
3     0        y          0               0
3     1        y          1               0
... [14 paired results in total]
This therefore gets rid of the full rank issue, and is basically coding in the interaction I wanted. My design is therefore:
design = ~ ID + delta_treated + delta_untreated
res <- results(dds, contrast = list(c("delta_treated"), c("delta_untreated")))
This all strikes me as a convoluted way around. But I couldn't get it to work by changing the treated to 'n' at time = 0, and running ID + visit + treated, as I couldn't get the difference in differences results out.
Is there a better way to do this?? No let me rephrase: what is the better way to do this?
Additionally, is there any good way for accounting for baseline differences between the groups at all? Thanks!