So, I have 3 isolates of bacteria. For each isolate, I have a control and a treatment condition. For each isolate/condition combination, I have biological triplicates, which have been run through the proper workflow to create count files (18 count files in total).
What I want to do is to compare control and treatment for each isolate. I'm not so interested in comparing the changes in control and treatment across the isolates. Obviously, I could just do three pairwise comparisons (run the analysis three times, once for each isolate), but the local statistician (who is not versed in R) noted that that would distort the error estimates.
My original analysis model was:
dehs<-DESeqDataSetFromHTSeqCount(sampleTable=matrix, directory=directory, design= ~ strain+condition+strain:condition)
This gives column names of:
resultsNames(dehs) "Intercept" "strain_F9_vs_F10" "strain_SH_vs_F10" "condition_H_vs_C" "strainF9.conditionH" "strainSH.conditionH"
I had been extracting "condition_H_vs_C" and changing the reference strain to whichever strain I wanted results for. The statistician believes that is incorrect. Is it?
Instead, he suggests the following: "If you remove the condition main effect but still include the interaction,"
dehs<-DESeqDataSetFromHTSeqCount(sampleTable=matrix, directory=directory, design= ~ strain+strain:condition)
"then I think you should get column names of:"
resultsNames(dehs) "Intercept" "strain_F9_vs_F10" "strain_SH_vs_F10" "strainF9.conditionH" "strainSH.conditionH" "strainF10.conditionH"
He notes: "The last three columns should be H vs. C for each of the three strains. You won’t get a condition main effect test since it is “absorbed” into the interaction."
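For reference, the column layout he describes can be checked without DESeq2 by building the design matrices with base R's model.matrix. This is only a sketch: the sample table below is made up to mirror my 3 strains x 2 conditions x 3 replicates (the `samples` object and its `replicate` column are hypothetical), and DESeq2 may format or order the coefficient names differently from plain model.matrix (for example "." in place of ":").

```r
# Hypothetical sample table: 3 strains x 2 conditions x 3 replicates = 18 samples
samples <- expand.grid(
  replicate = 1:3,
  condition = c("C", "H"),
  strain    = c("F10", "F9", "SH"),
  stringsAsFactors = FALSE
)

# Set reference levels explicitly: F10 for strain, C (control) for condition
samples$strain    <- factor(samples$strain,    levels = c("F10", "F9", "SH"))
samples$condition <- factor(samples$condition, levels = c("C", "H"))

# Original design: strain and condition main effects plus their interaction
full_design <- model.matrix(~ strain + condition + strain:condition, data = samples)

# Statistician's design: condition main effect dropped, interaction kept
nested_design <- model.matrix(~ strain + strain:condition, data = samples)

# Compare the coefficient (column) names produced by each design
print(colnames(full_design))
print(colnames(nested_design))

# Both matrices have 6 columns and full column rank
stopifnot(qr(full_design)$rank == 6, qr(nested_design)$rank == 6)
```

In the second matrix, the three strain:conditionH columns are each non-zero only for the H samples of one strain, which matches the statistician's description of one H vs. C column per strain.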
Will the statistician's approach work, or should I stick with pairwise comparisons?