Question

Interaction Factors Removing All Variability

0

Entering edit mode

4.0 years ago

tyak9569 • 0

Hi all!

I've been running into some wackiness when it comes to the creation of my DESeq object during the analysis of my RNA-Seq data. To give you some background, we have 8 total grouping, with two genotypes (transgenic and control), two treatment conditions, and two time points. For my masters' project, I put in our three independent variables into my DESeq object like so:

dds = DESeqDataSetFromMatrix(countData = counts, colData = coldata, design= ~treatment + time + genotype)

I ended up with differentially expressed genes with an effect of each variable besides treatment. In a previous experiment with a similar paradigm, we noted a significant genotype:time interaction, and I wanted to include that in the model just to see if anything new popped up. What I found was that including the interaction into my DESeq object completely removed any variability in my RNASeq data set, completely flattening my MA plots and not identifying any genes of interest. My code is as follows:

dds = DESeqDataSetFromMatrix(countData = counts1, colData = coldata, design = ~ time:genotype + treatment)

keep <- rowSums(counts(dds)) >= 20
dds <- dds[keep,]

dds$treatment <- relevel(dds$treatment, ref = "same")
dds$genotype <- relevel(dds$genotype, ref = "ctrl")

dds = DESeq(dds)

pdf("all_samples_dispersionPlot.pdf")
plotDispEsts(dds )
dev.off()

normed = counts(dds, normalized=TRUE)
write.csv(normed, file="all_samples_normCounts.csv")

res <- results(dds)
write.csv(as.data.frame(res), file="all_samples_results.csv")
pdf("all_samples_MAPlot.pdf")
DESeq2::plotMA(res, alpha = 0.1, ylim=c(-3,3), cex=.4)
abline(h=c(-1,1), col="dodgerblue", lwd=2)
dev.off()

Here's what my MA plot looked like previously:

https://imgur.com/2RqBl6M

Here's what it looks like with the interaction thrown in:

https://imgur.com/a/AAk6Wng

If you guys have any suggestions or explanation that I'm not aware of I'd love to hear it. My initial conclusion is that all of the variability in gene expression is due to the interaction, and that my data is basically useless, and I'm just hoping I'm wrong.

Thanks!

DESeq DESeq2 RNA-Seq R rna-seq • 1.0k views

ADD COMMENT • link updated 3.8 years ago by antonioggsousa 3.2k • written 4.0 years ago by tyak9569 • 0

score 0 · Answer 1 · 2020-07-02

Hi,

I believe that when you're using a model formula like yours, with multiple factors and/or interactions, DESeq2 will provide you multiple pairwise comparisons, but the last contrast will be used to plot dispersion, retrieve results, etc, unless that you specify the contrast or name of the specific pairwise comparison that you want to retrieve

So, you can check which contrast is being used by ploMA() by doing resultsNames(dds). I would say that in both cases you're plotting different pairwise comparisons, that's why you're getting such a different result.

To call a specific pairwise comparison to plot the MA, you can do after running the resultsNames(dds):

plotMA(results(dds, name=<name-specific-pairwise-comparison-given-by-resultsNames(dds)> ))

To assess variability the best way is doing a PCA or other multidimensional reduction technique and do not rely on this MA plot, because this plot is a consequence of what you can see on PCA before, that is, if you don't have variability among the groups that you want to explore, you'll not find genes diferentially expressed.

I hope this helps to answer your question!

António