Dear all, I have a question regarding the way I am interpreting my results. I really hope you can give me a hand. I have four groups that I am analyzing: 1) WT - vehicle 2) WT - treatment 3) Mut - vehicle 4) Mut - treatment
I am trying to get a list of DE genes, affected by the treatment. As a first step, I ran DESeq2, following the official vignette:
dds <- DESeqDataSetFromTximport(txi, colData = coldata, design = ~ genotype + condition) #set levelels dds$condition <- relevel(dds$condition, ref = "vehicle") dds$genotype <- relevel(dds$genotype, ref = "WT") #I run the deseq function dds <- DESeq(dds) #and then I got the results using "condition Treat vs vehicle", to get the comparison between treated and untreated res <- results(dds, name = "condition_Treat_vs_vehicle") # and then I got the results using "genotype_Mut_vs_WT", to get the comparison between treated and untreated res2 <- results(dds, name="genotype_Mut_vs_WT")
In order to understand which genes are affected only by treatment without any genotype effect, I then ran the LRT test.
dds_lrt <- DESeq(dds, test="LRT", reduced = ~genotype) res_LRT <- results(dds_lrt)
Then I selected the significant genes via
padj <0.05 for
And I overlapped these lists of genes using a Venn Diagram function to see how many were actually affected only from my treatment.
Now, I doubt that this comparison is actually not correct because I should only use the results from my LRT test, without considering the ones obtained via the Wald test.
It would be great to get more insights on this topic. Please let me know if there is any information missing that might help the answer.
Looking forward to your feedback
Thank you in advance :)
Your first contrast
"condition_Treat_vs_vehicle"is already looking for the effect of treatment while controlling for genotype. Since you only have two factor levels in
conditionthe LRT test should give you similar results to the Wald test used in
results, since the Wald test approximates the LRT, but is designed for pairwise comparisons.
The LRT might be a good choice if you have many pairwise comparisons but still want a single results table to look for genes that change over this large set of groups. For small number of groups, like three or so I would always use the Wald test because it tests against a fold change and therefore allows to filter for a minimal effect size. The LRT compares the goodness-of-fit, not fold changes. For stats people that might be intuitive, for me it never was, my brain is centered on fold changes.
thank you for your replies.. Actually, when I try to compare the significant genes obtained from
res2. I do not have the same overlapping as you can see in the picture Venn Diagram with res = Treat, res2= Geno, all genes selected for padj <0.05
Your LRT model excluding Treatment and your Treatment Wald test results share almost all of the same DEGs, which is as expected.
Oh ok, I see, thanks. So, for my question (DE genes upon treatment), I can focus my attention on the 423 that are affected only by a treatment effect, am I correct? And what about the 6 genes only in the Treatment? I am sorry for the many questions, this is the first set of data I am trying to analyze by myself and I want to make sure to get the right interpretation :)
In your case you don't need to run the LRT. Simply use the results from the treatment contrast. Make sure to use the log fold shrinkage method for DESeq2 also, and ideally setting a log2 fold change threshold within the function itself.
if I have a single factor which Subtype and it has 6 levels which is M0 to M5.
I have done the following
Since I did this as I had quite a number of pairwise comparison which in fact I did but at the same I ran this LRT.
How should I interpret this here my reference level is set to M0 sub-type. If I have to find genes that are differential across all sub-type how should I go ahead?
Should I combine all the genes in each of these
"FAB_M1_vs_M0" "FAB_M2_vs_M0" "FAB_M3_vs_M0" "FAB_M4_vs_M0" "FAB_M5_vs_M0"
I would like to get a suggestion