I have a general question about the correct use of the estimateDisp function in edgeR.
Suppose you have the following situation:
Sample  Condition
A1      Control
A2      Control
B1      Treatment1
B2      Treatment1
B3      Treatment1
C1      Treatment2
C2      Treatment2
C3      Treatment2
and you want to compare Control vs Treatment1, Control vs Treatment2, Treatment1 vs Treatment2.
The data matrix contains around 12,000 genes in rows and 8 columns (samples).
Is it possible, and correct, to apply the estimateDisp function only to the subset of samples being compared (e.g. only the Control and Treatment1 samples for Control vs Treatment1)? In other words, when I compare Control vs Treatment1, should the design matrix also contain Treatment2?
Now suppose a second situation: you only want to compare Treatment1 vs Treatment2, and for some reason samples A1 and A2 are never used in any comparison. In this case, is it correct to retain the control samples when estimating the dispersion?
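To make the two alternatives concrete, here is a minimal sketch of what I mean. It runs on simulated counts, not my real data: the matrix `counts`, the gene names, and the negative binomial parameters are placeholders made up for illustration. Only the sample layout matches the table above. It is not meant to show which option is right, just what each one looks like in code.

```r
library(edgeR)

# Hypothetical stand-in for the real data: 12,000 genes x 8 samples of simulated counts.
set.seed(1)
samples <- c("A1", "A2", "B1", "B2", "B3", "C1", "C2", "C3")
counts <- matrix(rnbinom(12000 * 8, mu = 50, size = 5), nrow = 12000,
                 dimnames = list(paste0("gene", 1:12000), samples))
group <- factor(c("Control", "Control",
                  rep("Treatment1", 3), rep("Treatment2", 3)))

# Option 1: keep all 8 samples, with one design column per condition.
y_all <- DGEList(counts = counts, group = group)
y_all <- calcNormFactors(y_all)
design_all <- model.matrix(~ 0 + group)
colnames(design_all) <- levels(group)
y_all <- estimateDisp(y_all, design_all)

# Option 2: keep only the samples in the comparison (here Control vs Treatment1).
keep <- group %in% c("Control", "Treatment1")
group_sub <- droplevels(group[keep])
y_sub <- DGEList(counts = counts[, keep], group = group_sub)
y_sub <- calcNormFactors(y_sub)
design_sub <- model.matrix(~ 0 + group_sub)
colnames(design_sub) <- levels(group_sub)
y_sub <- estimateDisp(y_sub, design_sub)

# Dimensions of each design (samples x coefficients) and the common dispersions.
dim(design_all)
dim(design_sub)
c(all = y_all$common.dispersion, subset = y_sub$common.dispersion)
```

With Option 1, each pairwise comparison would then be specified as a contrast on the single fitted model (for example with makeContrasts). With Option 2, the whole pipeline is rerun for each pair of conditions.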
Although it is fairly clear to me how to proceed, I'm somewhat confused by the way some bioinformaticians use the function in practice, as opposed to what the theory suggests.
Thank you in advance.