Question

DESeq2 with multiple variables gives me different results

3

4.0 years ago

Rafael Soler ★ 1.3k

I am running a DESeq2 analysis on one factor (Status) with several levels (A to E), comparing each level against E. I have performed the analysis in two different ways.

First, putting all the samples in the same DESeq object and then extracting each comparison:

> sampleinfo
    FileName    SampleName  Status
    A_1_count   A_1         A
    A_2_count   A_2         A
    B_3_count   B_3         B
    B_4_count   B_4         B
    C_5_count   C_5         C
    C_6_count   C_6         C
    D_7_count   D_7         D
    D_8_count   D_8         D
    E_9_count   E_9         E
    E_10_count  E_10        E

dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = sampleinfo,
                              design = ~ Status)

dds$Status <- relevel(dds$Status, ref = "E")

And the results:

dds <- DESeq(dds)
res_A <- results(dds,name="Status_A_vs_E")
res_B <- results(dds,name="Status_B_vs_E")
res_C <- results(dds,name="Status_C_vs_E")
res_D <- results(dds,name="Status_D_vs_E")

Second, running each comparison separately, each in its own DESeq object:

> sampleinfo_A
    FileName    SampleName  Status
    A_1_count   A_1         A
    A_2_count   A_2         A
    E_9_count   E_9         E
    E_10_count  E_10        E

> sampleinfo_B
    FileName    SampleName  Status
    B_3_count   B_3         B
    B_4_count   B_4         B
    E_9_count   E_9         E
    E_10_count  E_10        E

dds_A <- DESeqDataSetFromMatrix(countData = cts_A,
                                colData = sampleinfo_A,
                                design = ~ Status)

dds_B <- DESeqDataSetFromMatrix(countData = cts_B,
                                colData = sampleinfo_B,
                                design = ~ Status)

And the results:

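dds_A <- DESeq(dds_A)
# Name the contrast explicitly: if Status keeps its default alphabetical levels
# (A, E), the reference is A and a bare results(dds_A) reports E vs A, not A vs E.
res_A <- results(dds_A, contrast = c("Status", "A", "E"))

dds_B <- DESeq(dds_B)
res_B <- results(dds_B, contrast = c("Status", "B", "E"))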

(Repeat for each condition)

However, the two methods give different results. Does anyone know why this is happening? What is the correct way to compare all groups to E?

Thank you!

factor DESeq2 levels • 2.1k views

ADD COMMENT • link updated 4.0 years ago by andres.firrincieli 3.9k • written 4.0 years ago by Rafael Soler ★ 1.3k

score 3 · Accepted Answer · 2021-11-02

4.0 years ago

andres.firrincieli 3.9k

However, the two methods give different results. Does anyone know why this is happening?

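Because of the dispersion estimates. DESeq2 estimates each gene's dispersion from all the samples in the object, so the full 10-sample dataset and a 4-sample subset produce different estimates, and therefore different p-values. By including more samples you will get better dispersion estimates.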
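
To illustrate the point about sample number, here is a minimal base-R sketch. It is not DESeq2's actual procedure (DESeq2 fits a GLM and shrinks the estimates towards a trend); it uses a simple method-of-moments estimate, and the values of `mu`, `alpha` and `n_sim` are made up for the simulation. It compares how variable a dispersion estimate is when the within-group variance is pooled over two groups of two replicates versus five groups of two replicates.

```r
set.seed(1)

mu    <- 100    # assumed true mean count (hypothetical)
alpha <- 0.1    # assumed true dispersion (hypothetical)
n_sim <- 2000   # number of simulated genes

# Pooled within-group variance for n_groups groups of 2 replicates each.
# For simplicity every group is simulated with the same mean.
pooled_var <- function(n_groups) {
  x <- matrix(rnbinom(2 * n_groups, mu = mu, size = 1 / alpha), nrow = 2)
  mean(apply(x, 2, var))
}

# Method-of-moments dispersion from var = mu + alpha * mu^2,
# using the known mean to keep the sketch short.
mom_disp <- function(v) (v - mu) / mu^2

d2 <- replicate(n_sim, mom_disp(pooled_var(2)))  # E plus one group
d5 <- replicate(n_sim, mom_disp(pooled_var(5)))  # all five groups

# Spread of the estimates around the true value in each setting
c(sd_two_groups = sd(d2), sd_five_groups = sd(d5))
```

The estimates from five groups scatter less around the true dispersion than those from two groups, which is the intuition behind fitting all samples together.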

ADD COMMENT • link 4.0 years ago by andres.firrincieli 3.9k

0

If I get better estimates of dispersion, does that mean the variance in gene expression at a given mean value is better reflected? So you think it is a better way to compare all groups vs E?

ADD REPLY • link 4.0 years ago by Rafael Soler ★ 1.3k

1

So you think it is a better way to compare all groups vs E?

Generally speaking, that would be the best strategy. There are only a few cases where splitting the dataset before variance estimation might be the better strategy (see ATpoint's answer).
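
As a rough sketch of the single-model strategy, the code below fits all samples once and then pulls out each comparison against E with the `contrast` argument of `results()`. The count matrix `cts_sim`, the gene means and the object names are simulated stand-ins, not the data from the question.

```r
library(DESeq2)

set.seed(42)
n_genes <- 500
status  <- factor(rep(c("A", "B", "C", "D", "E"), each = 2))

# Simulated counts (hypothetical stand-in for the real `cts` matrix);
# each gene gets its own mean, all genes share a dispersion of 0.1.
gene_means <- 2^runif(n_genes, 4, 12)
cts_sim <- matrix(
  rnbinom(n_genes * length(status), mu = gene_means, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0(status, "_", seq_along(status)))
)

# Make E the reference level before building the object
coldata <- data.frame(Status = relevel(status, ref = "E"),
                      row.names = colnames(cts_sim))

dds_all <- DESeqDataSetFromMatrix(countData = cts_sim,
                                  colData = coldata,
                                  design = ~ Status)
dds_all <- DESeq(dds_all, quiet = TRUE)

# One fit, four comparisons: each is "group vs E"
groups   <- c("A", "B", "C", "D")
res_list <- lapply(setNames(groups, groups), function(g) {
  results(dds_all, contrast = c("Status", g, "E"))
})

resultsNames(dds_all)
```

Using `contrast = c("Status", g, "E")` states the direction of each comparison explicitly, so the sign of the log2 fold change does not depend on how the factor levels happen to be ordered.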