DESeq2 with multiple levels gives me different results
11 months ago
Rafael Soler ★ 1.1k

I am running a DESeq2 comparison on one factor with several levels. I have performed the analysis in two different ways.

First, putting all the samples in the same DESeq object and then extracting each comparison:

> sampleinfo
    FileName    SampleName  Status
    A_1_count   A_1         A
    A_2_count   A_2         A
    B_3_count   B_3         B
    B_4_count   B_4         B
    C_5_count   C_5         C
    C_6_count   C_6         C
    D_7_count   D_7         D
    D_8_count   D_8         D
    E_9_count   E_9         E
    E_10_count  E_10        E

dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = sampleinfo,
                              design = ~ Status)

dds$Status <- relevel(dds$Status, ref = "E")

And the results:

dds <- DESeq(dds)
res_A <- results(dds,name="Status_A_vs_E")
res_B <- results(dds,name="Status_B_vs_E")
res_C <- results(dds,name="Status_C_vs_E")
res_D <- results(dds,name="Status_D_vs_E")

Second, running each comparison separately, each on its own DESeq object:

> sampleinfo_A
    FileName    SampleName  Status
    A_1_count   A_1         A
    A_2_count   A_2         A
    E_9_count   E_9         E
    E_10_count  E_10        E

> sampleinfo_B
    FileName    SampleName  Status
    B_3_count   B_3         B
    B_4_count   B_4         B
    E_9_count   E_9         E
    E_10_count  E_10        E

dds_A <- DESeqDataSetFromMatrix(countData = cts_A,
                                colData = sampleinfo_A,
                                design = ~ Status)

dds_B <- DESeqDataSetFromMatrix(countData = cts_B,
                                colData = sampleinfo_B,
                                design = ~ Status)

And the results:

dds_A <- DESeq(dds_A)
res_A <- results(dds_A)

dds_B <- DESeq(dds_B)
res_B <- results(dds_B)

(Repeat for each condition)

However, the two methods give me different results. Does anyone know why this is happening? What is the correct way to compare all groups to E?

Thank you!

11 months ago

However, the two methods give me different results. Does anyone know why this is happening?

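Because of the dispersion estimates. DESeq2 estimates each gene's dispersion from all the samples in the object, so the 10-sample analysis and the 4-sample analyses work with different dispersion values and therefore give different p-values. By including more samples you will get better dispersion estimates.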
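The effect can be reproduced with a minimal sketch on simulated data. Everything here is an assumption for illustration: the counts come from DESeq2's `makeExampleDESeqDataSet()` rather than from the poster's `cts`, the object names `dds_all` and `dds_sub` are made up, and the pairwise subset is releveled so that E is the reference in both analyses. Without that relevel, a two-group object built from A and E would use A as the reference, and the fold changes would have the opposite sign.

```r
library(DESeq2)

set.seed(42)

# Simulated counts: 1000 genes, 10 samples, no true differences between groups.
dds_all <- makeExampleDESeqDataSet(n = 1000, m = 10, betaSD = 0)

# Hypothetical layout mirroring the question: five groups, two replicates each,
# with E as the reference level.
dds_all$Status <- relevel(factor(rep(c("A", "B", "C", "D", "E"), each = 2)),
                          ref = "E")
design(dds_all) <- ~ Status

# Pairwise subset (A and E only), taken before any fitting so that size
# factors and dispersions are estimated from these four samples alone.
dds_sub <- dds_all[, dds_all$Status %in% c("A", "E")]
dds_sub$Status <- droplevels(dds_sub$Status)  # E remains the first level

dds_all <- DESeq(dds_all, quiet = TRUE)
dds_sub <- DESeq(dds_sub, quiet = TRUE)

# Same contrast (A vs E) extracted from both fits.
res_all <- results(dds_all, contrast = c("Status", "A", "E"))
res_sub <- results(dds_sub, contrast = c("Status", "A", "E"))

# Same genes, but the final dispersion estimates differ between the fits ...
head(data.frame(disp_all = dispersions(dds_all),
                disp_sub = dispersions(dds_sub)))

# ... and so, in general, do the adjusted p-values (NA where a gene is filtered).
head(data.frame(padj_all = res_all$padj, padj_sub = res_sub$padj))
```

The size factors are also estimated separately in each object, which is a second, usually smaller, source of difference between the two approaches.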

If I get better dispersion estimates, does that mean the variance of each gene is modelled more accurately for a given mean expression value? So do you think it is better to analyse all groups together and compare each one against E?

So do you think it is better to analyse all groups together and compare each one against E?

Generally speaking, that would be the best strategy. There are only a few cases where splitting the dataset before variance estimation might be the better choice (see ATpoint's answer).
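As a rough sketch of that strategy, all comparisons against E can be pulled from a single fitted object with the `contrast` argument of `results()`. The data below are simulated with `makeExampleDESeqDataSet()` and the names `groups` and `res_list` are hypothetical; with real data, the `dds` built from `cts` and `sampleinfo` would take the place of the simulated one.

```r
library(DESeq2)

set.seed(1)

# Simulated stand-in for the real count matrix: 500 genes, 10 samples.
dds <- makeExampleDESeqDataSet(n = 500, m = 10)

# Five groups with two replicates each, E as the reference level.
dds$Status <- relevel(factor(rep(c("A", "B", "C", "D", "E"), each = 2)),
                      ref = "E")
design(dds) <- ~ Status

# Fit once, so every comparison shares the same dispersion estimates.
dds <- DESeq(dds, quiet = TRUE)

# Lists the fitted coefficients, e.g. Status_A_vs_E as used in the question.
resultsNames(dds)

# One results table per group, each compared against E.
groups <- c("A", "B", "C", "D")
res_list <- lapply(setNames(groups, groups), function(g) {
  results(dds, contrast = c("Status", g, "E"))
})

# Number of genes with adjusted p-value below 0.1 in each comparison.
sapply(res_list, function(r) sum(r$padj < 0.1, na.rm = TRUE))
```

Using `contrast = c("Status", g, "E")` states the numerator and denominator explicitly, so the direction of the fold change does not depend on how the factor levels happen to be ordered.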

Thank you! Very helpful :)
