Question

RNA-seq combined treatment interaction experiment design

10 months ago

Manko47 ▴ 10

Hello everyone,

I'm having trouble designing the next stage of analysis for my RNA-seq project, mainly because I'm not sure whether I'm overcomplicating what I want to do. I would be glad to get any advice.

I have 12 samples from 4 conditions - control group (D) and 3 treatment groups A, B and C.

Condition A is treated with substance X
Condition B is treated with substance Y
Condition C is treated with both substance X and substance Y

I made a simple metadata file with two columns (SampleNames and Treatment) and then did pairwise comparisons (A vs D), (B vs D) and (C vs D). Now, since C is a combination of the A and B treatments, I was asked by my PI to compare if the Fold Change from combined treatment (C vs D) is statistically different from Fold Change of single treatments (A vs D and B vs D). So basically the question to be answered is whether

H0: LFC(C vs D) - (LFC(A vs D) + LFC(B vs D)) = 0

and HA would be that the above difference is not 0, where LFC is the log2 fold change.

At first I thought about running further pairwise comparisons (A vs C) and (B vs C) and then trying to make sense of the overlapping targets. However, it got me thinking that this would not be the optimal approach for answering my question. So I started googling and reading the DESeq2 manual about "interaction" analysis. That led me to the following possibility: add columns to my metadata indicating all samples where substance X is used, do the same for substance Y, and then create a full model "X + Y + X:Y" and a reduced model "X + Y". My metadata now looks like this, where 1 indicates that the substance is present in that sample and 0 that it is absent.

Id	Treatment	SubstanceX	SubstanceY
A1	A	1	0
A2	A	1	0
A3	A	1	0
B1	B	0	1
B2	B	0	1
B3	B	0	1
AB1	C	1	1
AB2	C	1	1
AB3	C	1	1
D1	D	0	0
D2	D	0	0
D3	D	0	0

Then, after loading the tables, I run the following code:

dds_lrt <- DESeqDataSetFromMatrix(countData = count_table,
                                  colData = pheno_data_lrt,
                                  design = ~ SubstanceX + SubstanceY + SubstanceX:SubstanceY)
dds_lrt <- DESeq(dds_lrt, test = "LRT", reduced = ~ SubstanceX + SubstanceY)

And now, after running resultsNames(dds_lrt), one of the possible results to extract is named "SubstanceX1.SubstanceY1".
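One way to sanity-check the coding described above is to look at the design matrix directly. The following is a minimal sketch in base R only (no DESeq2 call), assuming the two indicator columns are stored as factors; the object names `pheno`, `full` and `reduced` are made up for illustration and are not from the original post.

```r
# Rebuild the sample sheet from the post. The 0/1 indicator columns are
# stored as factors (an assumption; numeric columns would be named differently).
pheno <- data.frame(
  Id         = c("A1", "A2", "A3", "B1", "B2", "B3",
                 "AB1", "AB2", "AB3", "D1", "D2", "D3"),
  Treatment  = rep(c("A", "B", "C", "D"), each = 3),
  SubstanceX = factor(rep(c(1, 0, 1, 0), each = 3)),
  SubstanceY = factor(rep(c(0, 1, 1, 0), each = 3))
)

# Design matrices for the full and reduced formulas used in the post
full    <- model.matrix(~ SubstanceX + SubstanceY + SubstanceX:SubstanceY, data = pheno)
reduced <- model.matrix(~ SubstanceX + SubstanceY, data = pheno)

# Columns of the full design: intercept, X effect, Y effect, interaction
print(colnames(full))

# The interaction column is 1 only for samples that received both substances
print(pheno$Id[full[, "SubstanceX1:SubstanceY1"] == 1])

# Replacing ":" with "." gives the coefficient name quoted in the post
print(make.names("SubstanceX1:SubstanceY1"))
```

The full design has exactly one more column than the reduced one, so comparing the two amounts to a test of that single interaction term. The "1" suffix in the name suggests the columns were read as factors; with numeric 0/1 columns, `model.matrix` would name the term `SubstanceX:SubstanceY` instead.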

And now the two questions that I want to ask are:

Does what I am doing make sense at all for the purpose of my analysis?
Assuming the approach is correct, did I design the metadata and code correctly, and is this entry from resultsNames(dds_lrt) the one I want to extract? This is all pretty new to me, so even assuming my approach makes sense, I'm not sure I'm extracting the correct object.

Will be glad for any answers that I get.

Thank you

RNA-seq DESeq2 • 932 views

updated 10 months ago by Ram 45k • written 10 months ago by Manko47 ▴ 10

score 2 · Accepted Answer · 2025-01-16

This is definitely one way to do this analysis.

You need to be slightly careful about interpreting the results though.

Under the reduced model ~ X + Y, if the LFC for a gene is 5 when treated with X, and 5 when treated with Y, then the expectation is that the LFC of a sample treated with both X and Y is 10.

Comparing this to the full model ~ X + Y + X:Y will identify genes where this is not the case. It will pick up both cases where treating with X + Y gives the same result as treating with either alone (LFC for C is 5), and cases where the treatments synergise and the LFC is larger (e.g. 15). The value of the SubstanceX1.SubstanceY1 coefficient will reflect this: in the former case the coefficient will be -5, and in the latter case it will be 5. To get the total LFC of the combined treatment vs control, you'd have to sum the SubstanceX1, SubstanceY1 and SubstanceX1.SubstanceY1 coefficients.
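The arithmetic above can be written out explicitly. This is only a sketch, using notation that is not in the original post: $\beta_0$, $\beta_X$, $\beta_Y$ and $\beta_{XY}$ stand for the intercept, SubstanceX1, SubstanceY1 and SubstanceX1.SubstanceY1 coefficients on the log2 scale, $x$ and $y$ are the 0/1 indicators from the metadata, and $\mu$ is the expected normalised expression of a gene.

```latex
\begin{align*}
\log_2 \mu &= \beta_0 + \beta_X x + \beta_Y y + \beta_{XY}\, x y \\
\mathrm{LFC}(A \text{ vs } D) &= \beta_X \\
\mathrm{LFC}(B \text{ vs } D) &= \beta_Y \\
\mathrm{LFC}(C \text{ vs } D) &= \beta_X + \beta_Y + \beta_{XY} \\
\beta_{XY} &= \mathrm{LFC}(C \text{ vs } D)
  - \bigl(\mathrm{LFC}(A \text{ vs } D) + \mathrm{LFC}(B \text{ vs } D)\bigr)
\end{align*}
```

With $\beta_X = \beta_Y = 5$, an interaction of $\beta_{XY} = -5$ gives a combined LFC of 5, and $\beta_{XY} = 5$ gives 15. Under these definitions, testing $\beta_{XY} = 0$ is the same as testing the H0 written in the question.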

But the way you've worded your question:

I was asked by my PI to compare if the Fold Change from combined treatment (C vs D) is statistically different from Fold Change of single treatments (A vs D and B vs D)

This makes it sound like that is not what your PI is looking for. It reads as if your PI thinks that if treatment with X gives an LFC of 5, then the null expectation is that treatment with both X and Y will also give an LFC of 5, and they are looking for genes where the LFC under treatment with both X and Y is different from 5. To test this, you'd fit the model ~X + X:Y vs ~X.
