Hello everyone,
I have problem with designing further analysis for my RNA-seq project - mainly however I'm not sure if i'm not overcomplicating the things that I want to do. Therefore I would be glad for getting any advice
I have 12 samples from 4 conditions - control group (D) and 3 treatment groups A, B and C.
- A condition is treated with substance X
- B condition is treated with substance Y
- C condition is treated with both substance X and Y
I did a simple metadata file with two columns (SampleNames
and Treatment
) and then did pairwise comparisons (A vs D), (B vs D) and (C vs D). Now since C is a combination of A and B treatment I was asked by my PI to compare if the Fold Change from combined treatment (C vs D) is statistically different from Fold Change of single treatments (A vs D and B vs D). So basically the question to be answered is whether
H0 - Log2FC(C vs D) - (LFC(AvsD) + LFC(B vs D))= 0
And HA
would be the above difference is not 0
At first I've thought about doing another pairwise comparisons (A vs C) and (B vs C) and then trying to make any sense out of the overlapping targets. However it got me thinking that this is not the approach that would be the optimal at answering my question. Therefore I started googling things and reading DESeq2 manual about "interaction" analysis. And with that I've arrived at a possibility of doing something like this - add columns to my metadata indicating all samples where substance X is used, do the same for substance Y, and then create a full model "X + Y + X:Y" and a reduced model "X + Y" So now my metadata looks like this. 1 indicates presence of substance in that sample, 0 means lack of substance.
Id | Treatment | SubstanceX | SubstanceY |
---|---|---|---|
A1 | A | 1 | 0 |
A2 | A | 1 | 0 |
A3 | A | 1 | 0 |
B1 | B | 0 | 1 |
B2 | B | 0 | 1 |
B3 | B | 0 | 1 |
AB1 | C | 1 | 1 |
AB2 | C | 1 | 1 |
AB3 | C | 1 | 1 |
D1 | D | 0 | 0 |
D2 | D | 0 | 0 |
D3 | D | 0 | 0 |
Then after loading tables I run the following code
dds_lrt <- DESeqDataSetFromMatrix(countData = count_table,
colData = pheno_data_lrt,
design = ~ SubstanceX + SubstanceY + SubstanceX:SubstanceY)
dds_lrt <- DESeq(dds_lrt, test = "LRT", reduced = ~ SubstanceX + SubstanceY)
And now after running resultNames(dds_lrt)
one of my possibile analysis to extract is named "SubstanceX1.SubstanceY1".
And now the two questions that I want to ask are
- Do what I am doing makes sense at all for the purpose of my analysis and additionaly.
- If we assume the approach is correct did I even design the metadata and code correctly. And is this object from resultNames(dds_lrt) the one I want to extract. This is all a pretty new stuff for me so even assuming my work makes sense I'm not even sure If I'm extracting correct object.
Will be glad for any answers that I get.
Thank you
Thank you for the answer and detailed explanation. This is really helpful as now I'm getting the general idea of how this kind of analysis works which in turn makes me able to judge whether this is what I am potentially interested in comparing.
Oh and as for the second part - in that case I'm sorry - I worded it badly. That kind of a comparision that you wrote in the first part is in fact exactly what my PI wants me to do. Nevertheless now that's my job to do it - you gave me all the insight that I needed proceed further, and actually have general idea of what exactly I'm doing, so thank you!