Hi, I’m doing my first RNA-seq analysis and am struggling to choose the correct model for my analysis aims. The study design: I have 100 study volunteers, each of whom received one of five treatments. Samples were taken before treatment (time = 1) and at three time points after treatment (times 2 to 4).
Volunteer  Treatment  Time
1          A          1
1          A          2
1          A          3
1          A          4
2          B          1
2          B          2
2          B          3
2          B          4
…
I want to compare the treatments overall, and also to compare how the treatments differ at and across the individual time points.
I was considering the following models:
1.) Volunteer + treatment:time
2.) Volunteer + treatment + treatment:time
3.) Treatment + time
4.) Treatment + treatment:time
For options 1) and 2), I’m not sure how much of a problem it is that treatment and volunteer are partly redundant: each volunteer receives a single treatment, so treatment is the same across all samples of that volunteer. However, I expect some volunteer-specific effects and would like to control for them. Is any of these the right choice, or do you have other suggestions? I’m particularly unsure how to handle the fact that the first time point for each volunteer is essentially a control for that specific volunteer and treatment.
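To make the redundancy mentioned above concrete, here is a minimal sketch in Python with NumPy. It is not the real data: it assumes a toy design of 4 volunteers, 2 treatments and 4 time points, and builds the design-matrix columns by hand with reference-level coding (volunteer 1, treatment A and time 1 as references). All variable and function names are made up for illustration, and an analysis package may expand the same formulas into different columns.

```python
import numpy as np

# Toy design, assumed for illustration only: 4 volunteers, 2 treatments
# (A, B), 4 time points; each volunteer receives exactly one treatment.
treatment_of = {1: "A", 2: "A", 3: "B", 4: "B"}
times = [1, 2, 3, 4]
samples = [(v, g, t) for v, g in treatment_of.items() for t in times]

vol = np.array([s[0] for s in samples])
trt = np.array([s[1] for s in samples])
tim = np.array([s[2] for s in samples])

def dummy(mask):
    """Turn a boolean mask into a 0/1 design-matrix column."""
    return mask.astype(float)

intercept = np.ones(len(samples))
# Volunteer effects, with volunteer 1 as the reference level.
volunteer_cols = [dummy(vol == v) for v in (2, 3, 4)]
# Treatment main effect, with treatment A as the reference level.
treatment_cols = [dummy(trt == "B")]
# Time effect within each treatment, with time 1 (pre-treatment) as reference.
interaction_cols = [dummy((trt == g) & (tim == t))
                    for g in ("A", "B") for t in times[1:]]

def columns_and_rank(cols):
    """Return (number of columns, matrix rank) of the design matrix."""
    X = np.column_stack(cols)
    return X.shape[1], int(np.linalg.matrix_rank(X))

# Option 2: volunteer + treatment + treatment:time
option2 = columns_and_rank(
    [intercept, *volunteer_cols, *treatment_cols, *interaction_cols])
# Option 1: volunteer + treatment:time
option1 = columns_and_rank([intercept, *volunteer_cols, *interaction_cols])

print("option 2 (columns, rank):", option2)  # (11, 10)
print("option 1 (columns, rank):", option1)  # (10, 10)
```

In this toy coding, option 2 has 11 columns but rank 10, because the treatment B column equals the sum of the volunteer 3 and volunteer 4 columns, so the treatment main effect cannot be estimated separately from the volunteer effects. Dropping the treatment main effect, as in option 1, gives 10 columns of rank 10. With time 1 as the reference level, each interaction column then measures the change from that treatment’s pre-treatment baseline within volunteer.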