Question

On the practical use of the estimateDisp function

5.0 years ago

elb ▴ 250

Hi guys,

I have a general question about the correct use of the estimateDisp function in edgeR.

Suppose you have the following situation:

Sample      Condition
A1          Control
A2          Control
B1          Treatment1
B2          Treatment1
B3          Treatment1
C1          Treatment2
C2          Treatment2
C3          Treatment2

and you want to compare Control vs Treatment1, Control vs Treatment2, Treatment1 vs Treatment2.

The data matrix contains around 12,000 genes in rows and 8 columns (samples).

Is it possible and correct to apply the estimateDisp function only to the subset of samples being compared (e.g. Control vs Treatment1)? In other words, when I compare Control vs Treatment1, should the design matrix also include Treatment2?

Now suppose a second situation: you want to compare Treatment1 vs Treatment2 only, and for some reason you never use samples A1 and A2. In this case, is it correct to retain the control samples when estimating the dispersion?
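To make the two options concrete, here is a minimal sketch in R of what I mean by "all samples" versus "subset only". It just sets up the two designs and is not meant as a recommendation either way. The object names (`samples`, `design_full`, `design_sub`, `contrasts`) are made up for illustration, the counts are simulated rather than my real data, and the edgeR part only runs if the package is installed.

```r
# Sample sheet from the table above.
samples <- data.frame(
  sample = c("A1", "A2", "B1", "B2", "B3", "C1", "C2", "C3"),
  group  = factor(c("Control", "Control",
                    "Treatment1", "Treatment1", "Treatment1",
                    "Treatment2", "Treatment2", "Treatment2")),
  stringsAsFactors = FALSE
)

# Option 1: one design over all 8 samples, one column per group.
design_full <- model.matrix(~ 0 + group, data = samples)
colnames(design_full) <- levels(samples$group)

# The three comparisons written as contrast vectors on the full design.
contrasts <- cbind(T1vsC  = c(-1, 1,  0),
                   T2vsC  = c(-1, 0,  1),
                   T1vsT2 = c( 0, 1, -1))
rownames(contrasts) <- colnames(design_full)

# Option 2: keep only the samples in one comparison (Control vs Treatment1).
keep <- samples$group %in% c("Control", "Treatment1")
sub  <- droplevels(samples[keep, ])
design_sub <- model.matrix(~ 0 + group, data = sub)
colnames(design_sub) <- levels(sub$group)

# Residual degrees of freedom (samples minus fitted group means).
df_full <- nrow(design_full) - ncol(design_full)  # 8 - 3
df_sub  <- nrow(design_sub)  - ncol(design_sub)   # 5 - 2

if (requireNamespace("edgeR", quietly = TRUE)) {
  # Simulated counts, purely illustrative (assumption: not real data).
  set.seed(1)
  counts <- matrix(rnbinom(12000 * 8, mu = 50, size = 10), ncol = 8,
                   dimnames = list(NULL, samples$sample))

  # Dispersion estimated from all samples with the full design.
  y_full <- edgeR::DGEList(counts = counts, group = samples$group)
  y_full <- edgeR::calcNormFactors(y_full)
  y_full <- edgeR::estimateDisp(y_full, design_full)

  # Dispersion estimated from the Control/Treatment1 subset only.
  y_sub <- edgeR::DGEList(counts = counts[, keep], group = sub$group)
  y_sub <- edgeR::calcNormFactors(y_sub)
  y_sub <- edgeR::estimateDisp(y_sub, design_sub)
}
```

The only difference between the two options that the sketch shows directly is the residual degrees of freedom: `df_full` is 5 and `df_sub` is 3.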

Although it is pretty clear to me how to proceed in theory, I'm a little confused by how some bioinformaticians use the function in practice.

Thank you in advance.

E.

RNA-Seq edgeR • 1.2k views

ADD COMMENT • link updated 5.0 years ago by Ram 43k • written 5.0 years ago by elb ▴ 250

I would ask this on the Bioconductor support forum.

ADD REPLY • link 5.0 years ago by Kevin Blighe 87k