RNA-seq combined treatment interaction experiment design
8 months ago
Manko47 ▴ 10

Hello everyone,

I have a problem designing the further analysis for my RNA-seq project. Mainly, I'm not sure whether I'm overcomplicating what I want to do, so I would be glad of any advice.

I have 12 samples from 4 conditions - control group (D) and 3 treatment groups A, B and C.

  • Condition A is treated with substance X
  • Condition B is treated with substance Y
  • Condition C is treated with both substances X and Y

I made a simple metadata file with two columns (SampleNames and Treatment) and ran pairwise comparisons (A vs D), (B vs D) and (C vs D). Since C is a combination of the A and B treatments, my PI asked me to test whether the fold change from the combined treatment (C vs D) is statistically different from the fold changes of the single treatments (A vs D and B vs D). So basically the question to be answered is whether

H0: LFC(C vs D) - (LFC(A vs D) + LFC(B vs D)) = 0

and HA would be that the above difference is not 0 (LFC meaning log2 fold change throughout).
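As a quick check of what this null hypothesis means, the difference can be written in terms of group means on the log2 scale. The sketch below uses made-up mean log2 expression values for one gene (the numbers and the helper `lfc` are hypothetical) and shows that the quantity in H0 equals C - A - B + D, which is the usual form of a two-factor interaction term.

```r
# Hypothetical mean log2 expression of one gene in each condition (illustrative only).
mean_log2 <- c(D = 4.0, A = 6.0, B = 5.0, C = 9.5)

# Log2 fold change of a condition relative to the control group D.
lfc <- function(group) unname(mean_log2[group] - mean_log2["D"])

# Left-hand side of H0: LFC(C vs D) - (LFC(A vs D) + LFC(B vs D)).
h0_difference <- lfc("C") - (lfc("A") + lfc("B"))

# The same quantity written directly in group means: C - A - B + D.
interaction_term <- unname(mean_log2["C"] - mean_log2["A"] - mean_log2["B"] + mean_log2["D"])

print(c(h0_difference = h0_difference, interaction_term = interaction_term))
```

With these example values both quantities come out the same, so testing H0 amounts to testing whether the interaction term is zero.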

At first I thought about running further pairwise comparisons (A vs C) and (B vs C) and then trying to make sense of the overlapping targets. However, it got me thinking that this would not be the optimal approach for answering my question, so I started googling and reading the DESeq2 manual about "interaction" analysis. That led me to the following possibility: add a column to my metadata indicating all samples where substance X is used, do the same for substance Y, and then create a full model "X + Y + X:Y" and a reduced model "X + Y". My metadata now looks like this, where 1 indicates the presence of a substance in that sample and 0 its absence.

Id Treatment SubstanceX SubstanceY
A1 A 1 0
A2 A 1 0
A3 A 1 0
B1 B 0 1
B2 B 0 1
B3 B 0 1
AB1 C 1 1
AB2 C 1 1
AB3 C 1 1
D1 D 0 0
D2 D 0 0
D3 D 0 0
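For reference, a table like the one above could be derived from the Treatment column rather than typed by hand. This is only a sketch in base R: the object name `pheno_data_lrt` is taken from the code in the post, while the helper vectors `has_x` and `has_y` are made up here. It stores the indicator columns as factors, which is an assumption about how the original file was loaded.

```r
# Rebuild the sample sheet from the post; column names follow the table above.
pheno_data_lrt <- data.frame(
  Id = c(paste0("A", 1:3), paste0("B", 1:3), paste0("AB", 1:3), paste0("D", 1:3)),
  Treatment = rep(c("A", "B", "C", "D"), each = 3)
)

# Substance X is present in conditions A and C; substance Y in conditions B and C.
has_x <- pheno_data_lrt$Treatment %in% c("A", "C")
has_y <- pheno_data_lrt$Treatment %in% c("B", "C")

# Store the indicators as factors with "0" (substance absent) as the reference level.
pheno_data_lrt$SubstanceX <- factor(as.integer(has_x), levels = c(0, 1))
pheno_data_lrt$SubstanceY <- factor(as.integer(has_y), levels = c(0, 1))
rownames(pheno_data_lrt) <- pheno_data_lrt$Id

# Sanity check: every X/Y combination should contain exactly 3 samples.
print(table(SubstanceX = pheno_data_lrt$SubstanceX, SubstanceY = pheno_data_lrt$SubstanceY))
```

The cross-tabulation at the end is a cheap way to confirm that the 2 x 2 layout is balanced before fitting any model.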

Then, after loading the tables, I run the following code:

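# Full model: main effects of X and Y plus their interaction.
# SubstanceX and SubstanceY are assumed to be factors with levels 0 and 1
# (this matches the coefficient name reported below).
dds_lrt <- DESeqDataSetFromMatrix(countData = count_table,
                                  colData = pheno_data_lrt,
                                  design = ~ SubstanceX + SubstanceY + SubstanceX:SubstanceY)
# Likelihood ratio test against the additive (no-interaction) model.
dds_lrt <- DESeq(dds_lrt, test = "LRT", reduced = ~ SubstanceX + SubstanceY)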

Now, after running resultsNames(dds_lrt), one of the possible results to extract is named "SubstanceX1.SubstanceY1".
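To see where a name like "SubstanceX1.SubstanceY1" comes from, it can help to look at the design matrix that base R builds for the same formula. The sketch below does not call DESeq2 at all; it only uses `model.matrix()` and `make.names()`, and the data frames `meta_factor` and `meta_numeric` are hypothetical stand-ins for the metadata. The reported name is consistent with the interaction column of a factor-coded design after ":" is replaced by ".".

```r
# Indicator values for conditions A, B, C, D (3 replicates each), as in the table.
x <- rep(c(1, 0, 1, 0), each = 3)
y <- rep(c(0, 1, 1, 0), each = 3)

# Factor-coded metadata: "0" is the reference level of each substance.
meta_factor <- data.frame(SubstanceX = factor(x, levels = c(0, 1)),
                          SubstanceY = factor(y, levels = c(0, 1)))
mm_factor <- model.matrix(~ SubstanceX + SubstanceY + SubstanceX:SubstanceY,
                          data = meta_factor)
print(colnames(mm_factor))

# make.names() turns the ":" of the interaction column into ".".
interaction_name <- make.names(colnames(mm_factor)[4])
print(interaction_name)

# For comparison: with numeric 0/1 columns there is no level suffix "1".
meta_numeric <- data.frame(SubstanceX = x, SubstanceY = y)
mm_numeric <- model.matrix(~ SubstanceX + SubstanceY + SubstanceX:SubstanceY,
                           data = meta_numeric)
print(colnames(mm_numeric))

# With a fitted object, the coefficient would then be extracted along the lines of
# (not run here, since it needs DESeq2 and the count data):
# res <- results(dds_lrt, name = "SubstanceX1.SubstanceY1")
```

Note that with test = "LRT" the p-values come from the full-versus-reduced comparison, while the reported log2 fold change is that of the named coefficient.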

The two questions that I want to ask are:

  1. Does what I am doing make sense at all for the purpose of my analysis?
  2. Assuming the approach is correct, did I design the metadata and code correctly, and is this entry from resultsNames(dds_lrt) the one I want to extract? This is all pretty new to me, so even assuming my work makes sense, I'm not sure I'm extracting the correct object.

Will be glad for any answers that I get.

Thank you

RNA-seq DESeq2 • 843 views
8 months ago

This is definitely one way to do this analysis.

You need to be slightly careful about interpreting the results though.

Under the reduced model ~ X + Y, if the LFC for a gene is 5 when treated with X, and 5 when treated with Y, then the expectation is that the LFC of a sample treated with both X and Y is 10.

Comparing this to the full model ~ X + Y + X:Y will identify genes where this is not the case. That includes both cases where treating with X and Y together gives the same result as treating with either alone (LFC for C is 5) and cases where the treatments synergise and the LFC is larger (e.g. 15). The value of the SubstanceX1.SubstanceY1 coefficient will reflect this: in the former case the coefficient will be -5, and in the latter it will be 5. To get the total LFC of the combined treatment vs control, you'd have to sum the SubstanceX1, SubstanceY1 and SubstanceX1.SubstanceY1 coefficients.
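The arithmetic in the two cases above can be reproduced with a toy fit. This is only an illustration using ordinary `lm()` on noise-free log2 values for one gene, not DESeq2's negative binomial model; the function name `interaction_coefs` and the control level of 0 are assumptions made for the example.

```r
# Fit the saturated 2 x 2 model to one noise-free log2 value per condition.
# lfc_x, lfc_y, lfc_both are the LFCs of A, B and C versus control D.
interaction_coefs <- function(lfc_x, lfc_y, lfc_both) {
  d <- data.frame(
    SubstanceX = factor(c(0, 1, 0, 1), levels = c(0, 1)),  # D, A, B, C
    SubstanceY = factor(c(0, 0, 1, 1), levels = c(0, 1)),
    log2_expr  = c(0, lfc_x, lfc_y, lfc_both)              # control set to 0
  )
  fit <- lm(log2_expr ~ SubstanceX + SubstanceY + SubstanceX:SubstanceY, data = d)
  round(coef(fit), 6)  # rounding only removes floating point noise
}

# Combined treatment no stronger than either alone (LFC for C is 5).
same_as_single <- interaction_coefs(5, 5, 5)
# Synergy (LFC for C is 15).
synergy <- interaction_coefs(5, 5, 15)

print(same_as_single)
print(synergy)

# Total LFC of C vs D is the sum of the two main effects and the interaction.
total_lfc_synergy <- sum(synergy[c("SubstanceX1", "SubstanceY1", "SubstanceX1:SubstanceY1")])
print(total_lfc_synergy)
```

The interaction coefficient is the deviation of the combined LFC from the additive expectation of 10, which is why it is negative in the first case and positive in the second.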

But the way you've worded your question:

I was asked by my PI to compare if the Fold Change from combined treatment (C vs D) is statistically different from Fold Change of single treatments (A vs D and B vs D)

From the way you've written this, it sounds as though that is not what your PI is looking for. Your PI thinks that if treatment with X gives an LFC of 5, then the null expectation is that treatment with both X and Y will also give 5, and they are looking for genes where the LFC under both X and Y differs from 5. To test this, you'd fit the model ~ X + X:Y vs ~ X.
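One point that may be worth checking before fitting ~ X + X:Y is how R expands that formula, because it depends on whether the indicator columns are numeric or factors. The sketch below uses only base R `model.matrix()` on a hypothetical four-row metadata table (one row per condition); it is not taken from the thread and does not run DESeq2.

```r
# One row per condition, in the order D, A, B, C.
x <- c(0, 1, 0, 1)
y <- c(0, 0, 1, 1)

# Numeric 0/1 coding: the interaction column is simply x * y, so it is 1 only
# for the combined treatment C, and B gets the same design row as D.
meta_numeric <- data.frame(SubstanceX = x, SubstanceY = y)
mm_numeric <- model.matrix(~ SubstanceX + SubstanceX:SubstanceY, data = meta_numeric)
print(colnames(mm_numeric))

# Factor coding: without a main effect for Y, R nests Y within each level of X,
# which adds a separate Y column for X = 0 and for X = 1.
meta_factor <- data.frame(SubstanceX = factor(x, levels = c(0, 1)),
                          SubstanceY = factor(y, levels = c(0, 1)))
mm_factor <- model.matrix(~ SubstanceX + SubstanceX:SubstanceY, data = meta_factor)
print(colnames(mm_factor))
```

With factor columns the design therefore has four columns rather than three, so a comparison against ~ X would test the effect of Y both without and with X at once. Inspecting the column names this way should make it clear which of the two models is actually being fitted.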


Thank you for the answer and the detailed explanation. This is really helpful: I'm now getting the general idea of how this kind of analysis works, which in turn lets me judge whether this is the comparison I am actually interested in.

Oh, and as for the second part: I'm sorry, I worded it badly. The kind of comparison that you described in the first part is in fact exactly what my PI wants me to do. Either way, it's now my job to do it. You gave me all the insight that I needed to proceed further and to actually have a general idea of what exactly I'm doing, so thank you!
