Question

DESeq2 discrepancy between analyses.

3.4 years ago

jordi.planells ▴ 480

Hi all! I am writing to you because I need help/advice on a DESeq2 analysis that I'm performing.

I have 4 different knockdowns (A, B, C, D) and 2 different chromatin fractions (short, long). I want to compare results within each chromatin fraction. Let's say A is my mock KD, so I want B, C, D vs A in the long fraction and, separately, B, C, D vs A in the short fraction. I have 2 different options for designing my analysis in DESeq2.

First option is to have everything in the same design matrix.

# Build DESeq2 object
dds = DESeqDataSetFromMatrix(countData = toDE, colData = meta, design = ~group)

where meta$group = paste0(meta$Knockdown,"_",meta$Fraction).
I first relevel with dds$group = relevel(dds$group, ref = "A_Long") to extract my results for the long fraction.

I do the analysis and then relevel against the short fraction with
dds$group = relevel(dds$group, ref = "A_Short") to get results for the short fraction.
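A minimal sketch of the joint design, for illustration only. It assumes 3 replicates per group and uses simulated counts from makeExampleDESeqDataSet() in place of the real toDE matrix, so the toy metadata below is hypothetical. Instead of releveling and refitting, it fits once and pulls each comparison with the contrast argument of results(). Note that with paste0(Knockdown, "_", Fraction) the level names take the form "A_Long".

```r
library(DESeq2)

set.seed(1)

# Hypothetical toy metadata: 4 knockdowns x 2 fractions x 3 replicates (assumed)
meta <- expand.grid(Replicate = 1:3,
                    Knockdown = c("A", "B", "C", "D"),
                    Fraction  = c("Long", "Short"),
                    stringsAsFactors = FALSE)
meta$group <- factor(paste0(meta$Knockdown, "_", meta$Fraction))
rownames(meta) <- paste0(meta$group, "_", meta$Replicate)

# Simulated counts standing in for the real count matrix
toDE <- counts(makeExampleDESeqDataSet(n = 1000, m = nrow(meta)))
colnames(toDE) <- rownames(meta)

# One model for all samples; dispersions are estimated from both fractions
dds <- DESeqDataSetFromMatrix(countData = toDE, colData = meta, design = ~ group)
dds <- DESeq(dds)

# Contrasts are given as c(factor, numerator level, denominator level)
res_B_long  <- results(dds, contrast = c("group", "B_Long",  "A_Long"))
res_B_short <- results(dds, contrast = c("group", "B_Short", "A_Short"))

# Number of genes below an adjusted p-value cutoff (0.05 is an assumed threshold)
sum(res_B_long$padj < 0.05, na.rm = TRUE)
```

The same pattern extends to C and D by changing the numerator level.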

My second option is to split the counts and metadata into long and short fractions and therefore do the analysis separately.

# Build DESeq2 object
dds = DESeqDataSetFromMatrix(countData = toDE[,colnames(toDE) %in% rownames(meta)[meta$Fraction=="Long"]], colData = meta[meta$Fraction=="Long",], design = ~Knockdown)
Then I repeat the same analysis for the short fraction.
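A sketch of the split approach under the same assumptions (simulated counts, 3 replicates per group, hypothetical metadata). The helper run_fraction() is an illustrative name, not part of DESeq2. It selects count columns by sample name rather than with %in%, so the column order is guaranteed to match the metadata rows.

```r
library(DESeq2)

set.seed(1)

# Hypothetical toy metadata: 4 knockdowns x 2 fractions x 3 replicates (assumed)
meta <- expand.grid(Replicate = 1:3,
                    Knockdown = c("A", "B", "C", "D"),
                    Fraction  = c("Long", "Short"),
                    stringsAsFactors = FALSE)
# Listing "A" first makes the mock KD the reference level
meta$Knockdown <- factor(meta$Knockdown, levels = c("A", "B", "C", "D"))
rownames(meta) <- paste0(meta$Knockdown, "_", meta$Fraction, "_", meta$Replicate)

# Simulated counts standing in for the real count matrix
toDE <- counts(makeExampleDESeqDataSet(n = 1000, m = nrow(meta)))
colnames(toDE) <- rownames(meta)

# Illustrative helper: fit a separate model on the samples of one fraction
run_fraction <- function(fraction) {
  keep <- meta$Fraction == fraction
  dds <- DESeqDataSetFromMatrix(countData = toDE[, rownames(meta)[keep]],
                                colData   = meta[keep, ],
                                design    = ~ Knockdown)
  DESeq(dds)
}

dds_long  <- run_fraction("Long")
dds_short <- run_fraction("Short")

# Dispersions here come from one fraction only, unlike the joint model
res_B_long_split  <- results(dds_long,  contrast = c("Knockdown", "B", "A"))
res_B_short_split <- results(dds_short, contrast = c("Knockdown", "B", "A"))
```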

I have performed both approaches. My problem is that the number of significant genes differs substantially depending on which approach I use.

Number of significant genes for first approach:

B_Long 278
B_Short 103
C_Long 98
C_Short 47
D_Long 396
D_Short 129

Number of significant genes for second approach:
B_Long 549
B_Short 218
C_Long 33
C_Short 6
D_Long 1108
D_Short 435

So my question is, which approach would you follow? The joint one, despite the fact that I don't want to compare fractions? Or the second one, where I do two completely separate analyses?
Thanks beforehand,
Jordi

R RNA-Seq DESeq2 • 834 views

score 0 · Answer 1 · 2020-12-03

3.4 years ago

swbarnes2 14k

In general, put everything together, unless your PCA gives you reason to think that dispersion differs greatly between your long and short samples.

https://www.bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#if-i-have-multiple-groups-should-i-run-all-together-or-split-into-pairs-of-groups
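One possible way to check this, sketched on simulated data (the metadata and 3 replicates per group are assumptions): look at the PCA coordinates, and also compare gene-wise dispersion estimates computed separately within each fraction. A large gap between the two medians would hint that pooling dispersions across fractions is questionable. Separation of the fractions on the PCA alone does not show that; the within-group spread is what matters.

```r
library(DESeq2)

set.seed(1)

# Hypothetical toy metadata: 4 knockdowns x 2 fractions x 3 replicates (assumed)
meta <- expand.grid(Replicate = 1:3,
                    Knockdown = c("A", "B", "C", "D"),
                    Fraction  = c("Long", "Short"),
                    stringsAsFactors = FALSE)
meta$group <- factor(paste0(meta$Knockdown, "_", meta$Fraction))
rownames(meta) <- paste0(meta$group, "_", meta$Replicate)

# Simulated counts standing in for the real count matrix
toDE <- counts(makeExampleDESeqDataSet(n = 1000, m = nrow(meta)))
colnames(toDE) <- rownames(meta)

dds <- DESeqDataSetFromMatrix(countData = toDE, colData = meta, design = ~ group)

# PCA coordinates on variance-stabilized counts (returnData = TRUE skips plotting)
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
pca <- plotPCA(vsd, intgroup = c("Fraction", "Knockdown"), returnData = TRUE)

# Median gene-wise dispersion, estimated within each fraction separately
disp_by_fraction <- sapply(c("Long", "Short"), function(f) {
  sub <- dds[, dds$Fraction == f]
  sub$group <- droplevels(sub$group)
  sub <- estimateSizeFactors(sub)
  sub <- estimateDispersionsGeneEst(sub)
  median(mcols(sub)$dispGeneEst, na.rm = TRUE)
})
disp_by_fraction
```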

In a PCA, my samples cluster completely separately by fraction (long vs. short), which is why I was wondering whether my situation is the one mentioned in the vignette... See the plot here

Reply • 3.4 years ago by jordi.planells ▴ 480