RNA-Seq | DESeq2 | differentially expressed genes | Dispersion
19 days ago
Dominic • 0

Dear community,

I am wondering about the 'correct' approach for my differential gene expression analysis using DESeq2.

Background: I am working with a set of 48 samples: plant material from four different tissues, sampled at four developmental stages. Each stage x tissue combination was sequenced with three biological replicates. I am interested in the differential expression of genes between stages (for each tissue; time series), as well as the differences between tissues (at each particular stage).

As the samples have quite diverse transcriptome profiles, I am not sure which of the following approaches is the more appropriate:

1) Subsetting the data before the analysis and doing a loop of only pair-wise comparisons: e.g. 'stage 1 | tissue 1' vs. 'stage 2 | tissue 1'

dds <- DESeqDataSetFromMatrix(countData = cts, colData = meta, design = ~ subset)

or

2) Analysing all samples together by defining one group per 'stage' x 'tissue' combination and extracting the comparisons of interest with contrasts

dds <- DESeqDataSetFromMatrix(countData = cts, colData = meta, design = ~ group)
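To make the two options concrete, here is a minimal sketch of both on simulated data. It is not the exact code from my analysis: the column names `stage`, `tissue`, `replicate` and `group`, the level labels (`S1`, `T1`, ...) and the simulated counts are all assumptions for illustration, and the subset model uses `~ stage` where my real code has a `subset` column.

```r
library(DESeq2)
set.seed(1)

# Hypothetical sample sheet: 4 stages x 4 tissues x 3 replicates = 48 samples.
meta <- expand.grid(replicate = 1:3,
                    tissue = paste0("T", 1:4),
                    stage = paste0("S", 1:4))
meta$group <- factor(paste(meta$stage, meta$tissue, sep = "_"))
rownames(meta) <- paste(meta$group, meta$replicate, sep = "_")

# Simulated counts stand in for the real `cts` matrix (assumption).
cts <- counts(makeExampleDESeqDataSet(n = 500, m = nrow(meta)))
colnames(cts) <- rownames(meta)

# Approach 1: keep only the 6 samples of one pairwise comparison.
keep <- meta$group %in% c("S1_T1", "S2_T1")
dds_sub <- DESeqDataSetFromMatrix(countData = cts[, keep],
                                  colData = droplevels(meta[keep, ]),
                                  design = ~ stage)
dds_sub <- DESeq(dds_sub)
res_sub <- results(dds_sub, contrast = c("stage", "S2", "S1"))

# Approach 2: fit all 48 samples once, then pull out the same comparison.
dds_all <- DESeqDataSetFromMatrix(countData = cts,
                                  colData = meta,
                                  design = ~ group)
dds_all <- DESeq(dds_all)
res_all <- results(dds_all, contrast = c("group", "S2_T1", "S1_T1"))
```

In approach 1 the dispersions are estimated from 6 samples per run, whereas in approach 2 they are estimated once from all 48 samples and every contrast reuses them.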

With both approaches I get similar results: highly significant p-values and approx. 2,300 DEGs (FDR-adjusted p-value < 0.01 and |log2FC| > 2), with approx. 85% of the identified DEGs shared between the two approaches.
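For reference, the filtering and overlap calculation can be sketched as follows. The helper `call_degs` and the toy tables are hypothetical; only the column names `padj` and `log2FoldChange` are the ones DESeq2's `results()` returns. The overlap is computed here as shared DEGs over the union of both lists, which is just one possible definition.

```r
# Hypothetical helper: IDs of genes passing both thresholds.
call_degs <- function(res, alpha = 0.01, lfc = 2) {
  res <- as.data.frame(res)
  # Genes with NA adjusted p-values (e.g. filtered out) never pass.
  pass <- !is.na(res$padj) & res$padj < alpha & abs(res$log2FoldChange) > lfc
  rownames(res)[pass]
}

# Toy result tables standing in for the two approaches (made-up values).
res_pairwise <- data.frame(log2FoldChange = c(3.1, -2.5, 0.4, 2.2),
                           padj = c(0.001, 0.004, 0.2, 0.03),
                           row.names = paste0("gene", 1:4))
res_group <- data.frame(log2FoldChange = c(2.9, -2.6, 0.5, 2.4),
                        padj = c(0.002, 0.003, 0.3, 0.005),
                        row.names = paste0("gene", 1:4))

degs_pairwise <- call_degs(res_pairwise)
degs_group <- call_degs(res_group)

# Overlap as shared DEGs divided by the union of both lists.
shared <- intersect(degs_pairwise, degs_group)
overlap_fraction <- length(shared) / length(union(degs_pairwise, degs_group))
```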

However, what unsettles me is that the dispersion estimates look quite different. For the pairwise comparison with only 6 samples at a time (Fig. 1), the estimates are shrunken, as I would expect, whereas the ~ group approach with all 48 samples yields estimates that are hardly shrunken at all (Fig. 2). Is this expected?
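To compare the two situations beyond eyeballing the plots, something like the following could be used. It is only a sketch on simulated data: `fit_dispersions` and `median_shift` are hypothetical helpers, and the simulated 6-sample and 48-sample objects stand in for my real ones. `plotDispEsts()` and the `mcols()` columns `dispGeneEst` and `dispersion` are part of DESeq2.

```r
library(DESeq2)
set.seed(1)

# Simulated stand-ins: a 6-sample, two-condition set and a 48-sample, 16-group set.
dds_small <- makeExampleDESeqDataSet(n = 500, m = 6)
dds_large <- makeExampleDESeqDataSet(n = 500, m = 48)
dds_large$group <- factor(rep(paste0("G", 1:16), each = 3))
design(dds_large) <- ~ group

# Hypothetical helper: size factors followed by dispersion estimation.
fit_dispersions <- function(dds) estimateDispersions(estimateSizeFactors(dds))
dds_small <- fit_dispersions(dds_small)  # 6 samples - 2 coefficients = 4 residual df
dds_large <- fit_dispersions(dds_large)  # 48 samples - 16 coefficients = 32 residual df

# Side-by-side dispersion plots, analogous to Fig. 1 and Fig. 2.
par(mfrow = c(1, 2))
plotDispEsts(dds_small)
title(main = "6 samples")
plotDispEsts(dds_large)
title(main = "48 samples")

# Hypothetical helper: median absolute log2 distance between the final
# dispersion and the gene-wise estimate (larger means stronger shrinkage).
median_shift <- function(dds) {
  d <- mcols(dds)
  median(abs(log2(d$dispersion) - log2(d$dispGeneEst)), na.rm = TRUE)
}
shift_small <- median_shift(dds_small)
shift_large <- median_shift(dds_large)
```

The residual degrees of freedom differ a lot between the two fits (4 versus 32), which is one plausible reason for the difference in shrinkage, since gene-wise estimates from more samples are more precise.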

Thank you very much in advance for the feedback and advice!

Best regards, D

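Fig. 1: dispersion estimates for a pairwise comparison (6 samples). Fig. 2: dispersion estimates for the ~ group design (all 48 samples).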


I'm also interested in an answer to this. I could be way off here, but my thinking is that I would split this into two separate differential expression analyses.

As you said yourself, you are interested in 1) the differential expression of genes between different stages (for each tissue; time series), and 2) the differences between tissues (at each particular stage).

I am not sure that combining it all into one analysis is a good idea. A PCA may also help if you do stick with this plan.
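A rough sketch of such a PCA, assuming the sample sheet has `stage` and `tissue` columns (hypothetical names) and using simulated counts in place of the real data:

```r
library(DESeq2)
set.seed(1)

# Simulated 48-sample object standing in for the real data (assumption).
dds <- makeExampleDESeqDataSet(n = 1000, m = 48)
dds$stage <- factor(rep(paste0("S", 1:4), each = 12))
dds$tissue <- factor(rep(rep(paste0("T", 1:4), each = 3), times = 4))

# Variance-stabilize the counts without using the design (blind = TRUE).
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)

# PCA plot of the samples, labelled by stage and tissue.
p <- plotPCA(vsd, intgroup = c("stage", "tissue"))
print(p)

# The underlying coordinates, if a custom plot is preferred.
pca_df <- plotPCA(vsd, intgroup = c("stage", "tissue"), returnData = TRUE)
```

If the tissues separate much more strongly than the stages, that would show up directly in the first components.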

Looking forward to reading what the more experienced suggest.
