Dear all,
I have repeated RNASeq results from a cohort. Everyone has samples at baseline before treatment is started (time = 0). Some are then treated, some arent (sadly not randomised)
I was wanting to assess the impact of treatment on gene expression using DESeq2.
At the moment my columns are set up like:
ID time treated
1 0 y
1 1 y
2 0 n
2 1 n
3 0 y
3 1 y
... [14 paired results in total]
I was therefore wanting to look at difference in differences of paired samples. The DESeq2 design code
design = ~ + ID + time*treated
does not work because it is not full rank becuase 'treated' is the same amongst the same ID.
I have therefore created a rather convoluted extra variable: delta_treated and delta_untreated:
ID time treated delta_treated delta_untreated
1 0 y 0 0
1 1 y 1 0
2 0 n 0 0
2 1 n 0 1
3 0 y 0 0
3 1 y 1 0
... [14 paired results in total]
This therefore gets rid of the full rank issue, and is basically coding in the interaction I wanted. My design is therefore:
design = ~ ID + delta_treated + delta_untreated
res <- results(dds, contrast = list(c("delta_treated"), c("delta_untreated")))
This all strikes me as a convoluted way around. But I couldn't get it to work by changing the treated to 'n' at time = 0, and running ID + visit + treated, as I couldn't get the difference in differences results out.
Is there a better way to do this?? No let me rephrase: what is the better way to do this?
Additionally, is there any good way for accounting for baseline differences between the groups at all? Thanks!