I was exploring my DE results from DESeq2 and noticed that some of my DEGs seemed to be particularly enriched in groups with a particular combination of factors. Therefore, I decided to carry out an interaction analysis using the following code (I include the gene Hspa1a as an example of a gene higher in samples with XY chromosomes and testes):
# Define formula including interaction term
formula <- as.formula("~chromosomes + gonads + chromosomes:gonads")
dds <- DESeqDataSetFromMatrix(countData = round(counts_filtered),
colData = metadata_filtered,
design = formula)
# PRemove genes with all samples <10 counts")
toRemove <- rowSums(counts(dds) < 10) == ncol(dds) # ncol(dds) equals N.samples
dds <- dds[!toRemove,]
### DE analysis
dds <- DESeq(dds)
# From metadata:
# chromosomes: Factor with 2 levels: "XX", "XY"
# gonads: Factor with 2 levels: "ov", "te"
# Adapted from ?results:
# the gonad effect for chromosomes XX (the main effect)
results(dds, name="gonads_te_vs_ov") %>% as.data.frame() %>% rownames_to_column("gene") %>% dplyr::filter(gene == "Hspa1a")
# # the gonad effect for chromosomes XY
# this is, by definition, the main effect *plus* the interaction term
# (the extra condition effect in XY compared to genotype XX).
results(dds, list( c("gonads_te_vs_ov", "chromosomesXY.gonadste") )) %>% as.data.frame() %>% rownames_to_column("gene") %>% dplyr::filter(gene == "Hspa1a")
# the interaction term, answering: is the gonad effect *different* across sex chromosome complement?
results(dds, name="chromosomesXY.gonadste") %>% as.data.frame() %>% rownames_to_column("gene") %>% dplyr::filter(gene == "Hspa1a")
### Plotting
vst_normCounts <- assay(vst(dds, blind=TRUE)) %>%
as.data.frame() %>%
rownames_to_column("symbol") %>%
dplyr::filter(symbol == "Hspa1a") %>%
pivot_longer(!symbol, names_to = "id", values_to = "vst_exp")
# Get the useful metadata columns for this interaction and add to counts
vst_columns <- c("id", "chromosomes", "gonads")
metadata_vst <- metadata_filtered %>%
rownames_to_column("id") %>%
dplyr::select(all_of(vst_columns))
vstPlot_df <- vst_normCounts %>%
dplyr::left_join(metadata_vst, by = "id")
ggplot(vstPlot_df, aes(x = chromosomes, y = vst_exp, color = gonads)) +
geom_point(size = 3) +
scale_colour_discrete(name = str_to_title(colnames(vstPlot_df)[5])) +
ggtitle(vstPlot_df$symbol[1]) +
xlab("Sex chromosome complement") +
ylab("Normalised expression") +
theme_light(base_size = 18) +
theme(plot.title = element_text(hjust = 0.5))
The results commands output the following:
> results(dds, name="gonads_te_vs_ov") %>% as.data.frame() %>% rownames_to_column("gene") %>% dplyr::filter(gene == "Hspa1a")
gene baseMean log2FoldChange lfcSE stat pvalue padj
1 Hspa1a 21847.06 0.2173161 0.1878391 1.156927 0.2473022 0.9416609
>
> # the gonad effect for chromosomes XY
> # this is, by definition, the main effect *plus* the interaction term
> # (the extra condition effect in XY compared to genotype XX).
> results(dds, list( c("gonads_te_vs_ov", "chromosomesXY.gonadste") )) %>% as.data.frame() %>% rownames_to_column("gene") %>% dplyr::filter(gene == "Hspa1a")
gene baseMean log2FoldChange lfcSE stat pvalue padj
1 Hspa1a 21847.06 0.8390135 0.2028182 4.136777 3.522188e-05 0.008823081
>
> # the interaction term, answering: is the gonad effect *different* across sex chromosome complement?
> results(dds, name="chromosomesXY.gonadste") %>% as.data.frame() %>% rownames_to_column("gene") %>% dplyr::filter(gene == "Hspa1a")
gene baseMean log2FoldChange lfcSE stat pvalue padj
1 Hspa1a 21847.06 0.6216974 0.2764394 2.248947 0.02451588 0.9996121
How is it possible that there is such a clear difference in values between gonads for XY samples compared to XX samples but the interaction term is nowhere close to significant? Am I not understanding properly how interactions work? Did I do something wrong in my code, or a larger effect size would be needed for significance? See the plot below: