Hi all, I would appreciate your help to better understand how to analyse our shRNA-Seq data. We have an shRNA library from mouse. Similar to Fig. 1 of this paper, we have two conditions (WT, KO) and two stages (before [b], after [a]), each with multiple replicates.
before samples - WT_b and KO_b
after samples - WT_a and KO_a
Also, we are doing a loss-of-function analysis, so we would like to see if there is a dropout in the KO compared to the WT.
I have searched for tools and found the edgeR tutorial, which gives several examples. But all these examples have only two conditions compared against each other. I have a read count table after using segemehl to map and HTseq-count to quantify the samples.
Do I need to take all four conditions into account, or should I only compare the after samples (KO_a vs. WT_a) to see if there are dropouts?
Should my experimental design include all the samples or only the last two groups (WT_a and KO_a)?
Should it be something like this:
(KO_b / KO_a) - (WT_b / WT_a)
if the columns of my sampleData are sample, condition and stage, like this:

sample  condition  stage
WT_1    WT         Input
...
WT_6    WT         after
...
WT_10   WT         after
KO_1    KO         Input
...
KO_10   KO         after
I would appreciate an idea of how to create the design matrix. Would
model.matrix(~stage) be enough, or would a more complex design such as
model.matrix(~condition + condition:stage) be necessary here?
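To make the question concrete, here is a minimal sketch in base R of the two designs I am considering. The sample table is a made-up, shortened version (two replicates per group, hypothetical sample names), not our real data, and the factor level order (WT and Input as reference levels) is my assumption.

```r
# Hypothetical, shortened sample table (2 replicates per group), not the real data.
sampleData <- data.frame(
  sample    = c("WT_1", "WT_2", "WT_3", "WT_4", "KO_1", "KO_2", "KO_3", "KO_4"),
  condition = factor(c("WT", "WT", "WT", "WT", "KO", "KO", "KO", "KO"),
                     levels = c("WT", "KO")),        # assumption: WT is the reference
  stage     = factor(c("Input", "Input", "after", "after",
                       "Input", "Input", "after", "after"),
                     levels = c("Input", "after"))   # assumption: Input is the reference
)

# Design 1: stage only, ignoring WT/KO.
design_stage <- model.matrix(~ stage, data = sampleData)

# Design 2: condition plus a separate stage effect within each condition.
design_nested <- model.matrix(~ condition + condition:stage, data = sampleData)

# Inspect which coefficients each design would estimate.
print(colnames(design_stage))
print(colnames(design_nested))
```

As far as I understand, the second design gives one after-vs-Input coefficient per condition, so the difference between those two coefficients would be the comparison I wrote above, but I am not sure this is the right way to set it up.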
Thanks in advance.