Off topic: Multifactorial Design Formula in edgeR
10.2 years ago

Dear All,

I am new to edgeR and am still reading the vignette in detail so that I can apply it to my data. I have a question about understanding model.matrix. On page 27 (section 3.3.2, "Nested interaction formulas"), the design is defined as:

> targets
    Sample   Treat Time
1  Sample1 Placebo   0h
2  Sample2 Placebo   0h
3  Sample3 Placebo   1h
4  Sample4 Placebo   1h
5  Sample5 Placebo   2h
6  Sample6 Placebo   2h
7  Sample1    Drug   0h
8  Sample2    Drug   0h
9  Sample3    Drug   1h
10 Sample4    Drug   1h
11 Sample5    Drug   2h
12 Sample6    Drug   2h

targets$Treat <- relevel(targets$Treat, ref="Placebo")

design <- model.matrix(~Treat + Treat:Time, data=targets)

and the coefficient names are:

> colnames(design)
[1] "(Intercept)" "TreatDrug"
[3] "TreatPlacebo:Time1h" "TreatDrug:Time1h"
[5] "TreatPlacebo:Time2h" "TreatDrug:Time2h"
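
For anyone who wants to reproduce this without the vignette's data files, here is a minimal sketch in base R. The way the `targets` table is rebuilt (with `rep()` and explicit factors) is my own assumption; only the formula and the expected column names come from the vignette.

```r
# Rebuild the sample table shown above (construction is an assumption,
# the vignette reads it from a file).
targets <- data.frame(
  Sample = paste0("Sample", rep(1:6, times = 2)),
  Treat  = factor(rep(c("Placebo", "Drug"), each = 6)),
  Time   = factor(rep(rep(c("0h", "1h", "2h"), each = 2), times = 2))
)

# Make Placebo the reference level, as in the vignette.
targets$Treat <- relevel(targets$Treat, ref = "Placebo")

# Nested formula: a treatment effect, plus a time effect within each treatment.
design <- model.matrix(~ Treat + Treat:Time, data = targets)
print(colnames(design))
```

Because `Time` has no main effect in this formula, R expands `Treat:Time` into one time coefficient per treatment level, which is where the `TreatPlacebo:...` columns come from.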

Whereas on page 28 (paragraph 3.3.4 "Interaction at any time") the design formula looks like this (I added "2" in "design2" compared to original text for easier following):

> design2 <- model.matrix(~Treat + Time + Treat:Time, data=targets)
> colnames(design2)
[1] "(Intercept)" "TreatDrug"
[3] "Time1h" "Time2h"
[5] "TreatDrug:Time1h" "TreatDrug:Time2h"

For design2, the vignette explains (top of page 29): "The last two coefficients give the DrugvsPlacebo.1h and DrugvsPlacebo.2h contrasts, so that

> lrt <- glmLRT(fit, coef=5:6)

is useful because it detects genes that respond differently to the drug, relative to the placebo, at either of the times."

My question is, if I have understood correctly: why are there no coefficients "TreatPlacebo:Time1h" and "TreatPlacebo:Time2h" in design2? And shouldn't "Time1h" and "Time2h" be the effects of time regardless of treatment, rather than: "

> lrt <- glmLRT(fit, coef=3)

and

> lrt <- glmLRT(fit, coef=4)

are the effects of the reference drug, i.e., the effects of the placebo at 1 hour and 2 hours" as it is written in the vignette text?
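
One way to check what each coefficient in design2 means is to look at which columns are switched on for each Treat/Time group. The following is only a sketch in base R: the `targets` table is rebuilt by hand (an assumption), and the helper names `group`, `cells` and `nested` are mine, not the vignette's.

```r
# Rebuild the two factors of the sample table (construction is an assumption).
targets <- data.frame(
  Treat = relevel(factor(rep(c("Placebo", "Drug"), each = 6)), ref = "Placebo"),
  Time  = factor(rep(rep(c("0h", "1h", "2h"), each = 2), times = 2))
)

design2 <- model.matrix(~ Treat + Time + Treat:Time, data = targets)
nested  <- model.matrix(~ Treat + Treat:Time, data = targets)

# One design row per Treat/Time group: the coefficients that add up to
# that group's fitted log-expression.
group <- paste(targets$Treat, targets$Time, sep = ".")
cells <- design2[!duplicated(group), ]
rownames(cells) <- unique(group)
print(cells)

# Placebo 1h vs Placebo 0h differs only in Time1h ...
placebo_1h_vs_0h <- cells["Placebo.1h", ] - cells["Placebo.0h", ]
# ... whereas Drug 1h vs Drug 0h is Time1h plus TreatDrug:Time1h.
drug_1h_vs_0h <- cells["Drug.1h", ] - cells["Drug.0h", ]
print(placebo_1h_vs_0h)
print(drug_1h_vs_0h)

# Both parametrizations span the same 6-dimensional column space,
# i.e. they describe the same model with differently defined coefficients.
combined_rank <- qr(cbind(design2, nested))$rank
print(combined_rank)
```

Read this way, `Time1h` in design2 is the 1h vs 0h change in the reference (Placebo) group, and `TreatDrug:Time1h` is the extra change in the Drug group on top of that, which seems to be what the vignette text is describing.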


Why I need edgeR: I have an RNA-seq experiment (~30 samples) in which I need to explore the influence of three factors with two levels each:

  1. sex: f/m

  2. disease_state: healthy/cancer

  3. localization: blood/bones.

The question I want to answer: which genes are differentially expressed between the two localizations in the two disease states (i.e., are bones more severely affected by cancer than blood), taking sex into account? I assume that my design formula should look like: design=~sex+disease+localization+disease:localization

Could anyone please tell me whether the formula is correct? What should the output be? How would I know whether the disease has different effects depending on the localization? By the number of genes affected (= differentially expressed)?
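
To make the proposed formula concrete, here is a small base-R sketch of the design matrix it would produce. The sample table is entirely hypothetical (a balanced layout with two replicates per group, not my real data), and the reference levels f, healthy and blood are assumptions. The edgeR calls in the trailing comments follow the pattern used in the vignette and are not run here.

```r
# Hypothetical, balanced sample table: 2 x 2 x 2 groups, 2 replicates each.
samples <- expand.grid(
  sex          = c("f", "m"),
  disease      = c("healthy", "cancer"),
  localization = c("blood", "bones"),
  replicate    = 1:2,
  stringsAsFactors = FALSE
)

# Set reference levels explicitly (assumed: f, healthy, blood).
samples$sex          <- factor(samples$sex, levels = c("f", "m"))
samples$disease      <- factor(samples$disease, levels = c("healthy", "cancer"))
samples$localization <- factor(samples$localization, levels = c("blood", "bones"))

design3 <- model.matrix(
  ~ sex + disease + localization + disease:localization,
  data = samples
)
print(colnames(design3))

# Position of the interaction coefficient in the design matrix.
interaction_coef <- which(colnames(design3) == "diseasecancer:localizationbones")
print(interaction_coef)

# With edgeR, the interaction could then presumably be tested per gene
# along the lines of the vignette (not run; `y` would be a DGEList):
#   fit <- glmFit(y, design3)
#   lrt <- glmLRT(fit, coef = interaction_coef)
```

If I read the model correctly, the last coefficient measures how much the cancer vs healthy difference in bones deviates from the same difference in blood, so a per-gene test on that coefficient would address the "different effect depending on localization" question, rather than comparing counts of differentially expressed genes.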

I would very much appreciate it if someone has time to help me with any of these questions.

Best, Mike
