Question: DESeq2 : three factor conditions
18
6.1 years ago by
Belgium, Brussels
Nicolas Rosewick9.2k wrote:

Hi,

I've a small question concerning differential expression analysis. So I've a 12 samples divided into three groups (A, B and C ). If a do that :

```samples <- read.table("samples.txt",sep="\t",as.is=T)
colnames(samples)<-c("sampleName","fileName","condition")
dds1 <- DESeqDataSetFromHTSeqCount(sampleTable=samples,directory="./",design= ~ condition)
dds1 <- DESeq(dds1) res <- results(dds1)```

What will the p-value represent ? if it's significatif, is it because the gene of interest is up or down regulated in one of the three groups ?

To do pairwise comparison, I do that. Is it correct ?

```res.A.B <- results(dds1, contrast=c("condition","A","B"))
res.A.C <- results(dds1, contrast=c("condition","A","C"))
res.B.C <- results(dds1, contrast=c("condition","B","C"))```

Thanks

mulitple conditions deseq2 • 17k views
modified 6.1 years ago by Devon Ryan97k • written 6.1 years ago by Nicolas Rosewick9.2k
14
6.1 years ago by
Devon Ryan97k
Freiburg, Germany
Devon Ryan97k wrote:

By default, DESeq2 uses a Wald test to compute the p-values, so it's likely that `res` just has the output of the A vs. C contrast (rather than using a likelihood ratio test to compare `~condition` versus `~1`). Regardless of the test, the direction of change and the p-value are separate, so you'll have significant findings in both directions.

Yes, the method you showed for the pairwise comparisons looks correct.

BTW, if you really do want to test whether something is changed due to condition in general (i.e., do something like a one-way ANOVA), you can just use the LRT methods. Keep in mind that the direction of change there can be questionable, since you're then asking if the full model fits better than the reduced rather than asking of parameter estimates (i.e., fold-changes) are different than 0 (or a different threshold, should you specify something else...though that's not common).

3

Thanks Devon. For LRT is that ok ?

```ddsLRT <- DESeq(dds1, test="LRT", reduced= ~ 1)
resLRT <- results(ddsLRT)```
1

Yup, that's exactly what I'd use.

ok great ! thanks again. I accept the answer ;)

May I add some follow up questions please? (should I open a new discussion)

If one does an LRT test with ~condition as a full model and ~1 as a reduced model then the results names for the dds object are: "Intercept", "conditionA_vs_B", "conditionC_vs_B". I think I understood that the pvalues are the same whatever the "name" argument used for results(). But the LogFoldChanges seems to be dependent on the "name" argument. So I was wondering:

-How can one compare  condition A versus C, as it is not within the results names?

-My understanding is that if one wants pairwise comparisons then Wald tests should be used. So what are those LogFold Changes and how can they be used?

Many thanks

There is a lot of information for you in `?results`

See the 'test' argument and the 'contrast' argument descriptions for example. And see the section On p-values under Details.

In short: LRT gives you one p-value for there being any difference across the 3 levels. If you want to do pairwise comparisons, use 'contrast' and test="Wald". You can run Wald tests simply using results() if you are using version >= 1.6.

1

Hello,

Thanks a lot for your answer. My understanding from the ?results is that:

If I have a "condition" factor such as c(A, B,C) and if I use a full model ~condition and a reduced model ~1 (with test= LRT), I will generate results with the following names: Intercept, condition_B_vs_A, condition_C_vs_A. Then I do understand that I can compare B versus A by doing:

`results(dds, name=" condition_B_vs_A" , test="Wald")`

and C versus A by doing:

`results(dds, name=" condition_C_vs_A" , test="Wald")`

What I really am not sure about is how to compare B and C with the results name I have. I do see that I would have to use "contrast". But my question is beyond DESeq2 I think, it is the formulas and constrasts in general that I am having difficulties with. A wild guess would be:

`results(dds, constrast=c(0,1,-1) , test="Wald")`

because (B-A)-(C-A) = B-C. Does it make any sense please?

Hi Aurelie,

you can modify the condition to c_A, b_B, a_C instead of A, B, C. This time, you can have condition_b_B_vs_a_C.

Aifu

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by afli190