Question

Bulk RNA-seq, seeing strong differences between groups at a timepoint post-infection, but no strong interaction effect

18 months ago

Derrik ▴ 40

I have a time-series bulk RNA-seq study with 10 individuals split into two groups, each receiving a different condition. I have samples from before the intervention and from several time points after it. The study design looks roughly like this:

Individual| Time | Group
----------------------------
1         |  0   | Condition 1 
2         |  0   | Condition 1
3         |  0   | Condition 1
4         |  0   | Condition 1
5         |  0   | Condition 1
6         |  0   | Condition 2
7         |  0   | Condition 2
8         |  0   | Condition 2
9         |  0   | Condition 2
10        |  0   | Condition 2
1         |  1   | Condition 1
2         |  1   | Condition 1
...
10        |  7   | Condition 2

I am using DESeq2 to analyze the data, with the design formula

Group:GroupID + Group * Time

where Group:GroupID accounts for individual-to-individual variation within each group. When I look at DEGs at each time point within each group, I use the following contrasts in the results call:

For condition 1:

name = 'Time_DX_vs_D0'

For condition 2:

contrast = list(c('Time_DX_vs_D0', 'GroupCondition1.TimeDX'))

I see the following trends in the number of significant DEGs:

enter image description here

Yet when I look at just the significant DEGs in the interaction terms, I get this:

enter image description here

To my eye, the spike in DEGs at day 4 relative to D0 in one group, which is not matched in the other group or at the other time points, suggests a strong interaction between condition and time, yet the interaction term shows no significant DEGs. Am I misinterpreting these results? How can there be a strong difference between baseline and one group:timepoint combination that apparently does not have a strong interaction effect?

Time-series interaction RNA-seq DESeq2 • 634 views

score 1 · Answer 1 · 2022-11-02

18 months ago

i.sudbery 19k

Looking at numbers of significant genes to compare conditions can be misleading, because there are two reasons why genes might not come up as differentially expressed:

There is genuinely no change in expression.
Expression is sufficiently variable that the null hypothesis is not rejected even when there is a difference.

Your second graph suggests that at least one of your groups is too variable to say for certain that genes are reacting differently between conditions.
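To make the point about DEG counts concrete, here is a minimal simulation sketch in Python (standard library only). It is not DESeq2 and does not use the poster's data: every name and number in it (`simulate_counts`, the change of 1.5 log2 units, the standard deviations 0.5 and 2.0, 2000 genes) is an assumption chosen for illustration. Both groups are given the same true change from day 0, so there is no true interaction; only the noise differs between groups.

```python
import math
import random
import statistics

N = 5               # individuals per group, as in the question
T_CRIT_DF4 = 2.776  # two-sided 5% cutoff of Student's t with N - 1 = 4 df


def simulate_counts(effect1, effect2, sd1, sd2, genes=2000, seed=0):
    """Count 'significant' genes per group and for the interaction.

    Each simulated value stands for one individual's log2 fold change
    (later time point minus day 0), an idealised stand-in for real data.
    """
    rng = random.Random(seed)
    hits1 = hits2 = hits_int = 0
    for _ in range(genes):
        g1 = [rng.gauss(effect1, sd1) for _ in range(N)]
        g2 = [rng.gauss(effect2, sd2) for _ in range(N)]
        m1, m2 = statistics.mean(g1), statistics.mean(g2)
        se1 = statistics.stdev(g1) / math.sqrt(N)
        se2 = statistics.stdev(g2) / math.sqrt(N)
        # Within-group change versus day 0: one-sample t-test, df = N - 1.
        hits1 += abs(m1 / se1) > T_CRIT_DF4
        hits2 += abs(m2 / se2) > T_CRIT_DF4
        # Interaction: difference between the two groups' changes.
        # Welch-type statistic judged against the df = 4 cutoff,
        # a deliberately conservative simplification.
        hits_int += abs((m1 - m2) / math.sqrt(se1**2 + se2**2)) > T_CRIT_DF4
    return hits1, hits2, hits_int


# Same true change in both groups; group 2 is simply noisier.
quiet, noisy, interaction = simulate_counts(1.5, 1.5, sd1=0.5, sd2=2.0)
print("group 1 hits:", quiet)
print("group 2 hits:", noisy)
print("interaction hits:", interaction)
```

Under these assumptions the quiet group yields far more "significant" genes than the noisy group, while the interaction test flags only a small fraction of genes. A large gap in DEG counts between groups is therefore compatible with no detectable interaction.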

These sorts of designs have a lot of coefficients to fit, so power to detect differences, especially in the interactions, is low.
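The lower power for interaction terms can be seen with a back-of-the-envelope calculation. The sketch below assumes an idealised setting (per-individual log fold changes with a common standard deviation `sd`, and 5 individuals per group as in the question) rather than DESeq2's negative binomial model, and the function names are made up for illustration. The interaction is a difference between two group-level changes, so its standard error combines the uncertainty of both.

```python
import math


def se_within_group(sd, n):
    """Standard error of the mean change within one group of n individuals."""
    return sd / math.sqrt(n)


def se_interaction(sd, n1, n2):
    """Standard error of the difference between two groups' mean changes."""
    return sd * math.sqrt(1.0 / n1 + 1.0 / n2)


sd = 1.0  # assumed common standard deviation of the log fold changes
n = 5     # individuals per group
ratio = se_interaction(sd, n, n) / se_within_group(sd, n)
print(round(ratio, 3))  # 1.414, i.e. sqrt(2)
```

In this simplified picture, for the same true effect size the interaction test statistic is about 1.4 times smaller than the within-group one, and that is before accounting for the extra coefficients the full design has to estimate.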

ADD COMMENT • link 18 months ago by i.sudbery 19k

Thanks. It is surprising, because in PCA and other measures of sample correlation the D4 samples were not obviously highly variable, especially relative to some of the other groups. We've looked at things several ways and are confident there wasn't an error in metadata association, design, etc., so it does seem to be 'real' signal.

samRNA is condition 2 in this case

enter image description here

Additionally, some heatmaps (plotting relative expression normalized to the Day 0 median) do seem to show signal that differs between the treatments on D4, but maybe there is more variation than is obvious to us from these visualizations? I wondered whether the apparent shared patterns in D1 and D4 mean the difference observed at D4 is being accounted for by another term.

enter image description here

ADD REPLY • link 18 months ago by Derrik ▴ 40