Question: Pairwise comparisons between multiple groups with DESeq2
gravatar for lazappi
5.6 years ago by
lazappi0 wrote:


I have a set of 12 RNA-seq samples spread across 5 different groups (different tissues). At the moment I am focusing on making pairwise differential expression comparisons between the groups using DESeq2. I've noticed that different results are returned depending on whether I pass all the data to DESeq2 then request a contrast between two groups or instead manually select the groups I'm interested in then give just that data to DESeq2:

Metadata <- data.frame(name, count.file, group)
des2.samples <- Metadata

# Not sure if I should do this!
# des2.samples <- des2.samples[des2.samples$group %in% c("Group2", "Group3"), ]

# Create DESeqDataSet object from HTSeq-count files <- DESeqDataSetFromHTSeqCount(des2.samples, design = ~ group, 
                                        directory = "data")

# Calculate DE and get results <- DESeq(
des2.res <- results(, contrast = c("group", "Group2", "Group3"))

# Order by padj
des2.res <- des2.res[order(des2.res$padj), ]

# Check out the summary

I believe that the differences are likely due to how DESeq2 does its filtering but I'm unsure what the best approach is, particularly as one group is a clear outlier to the others and may skew the results? I'm also wondering if a similar affect would be seen with other packages (edgeR, DESeq, voom etc.) and whether they would need to be treated differently.


ADD COMMENTlink modified 5.6 years ago by umer.zeeshan.ijaz1.8k • written 5.6 years ago by lazappi0
gravatar for umer.zeeshan.ijaz
5.6 years ago by
Glasgow, UK
umer.zeeshan.ijaz1.8k wrote:

It depends on you, if you want to capture overall differences (obviously skewed by one group being an outlier) then you can pool them together, otherwise you can do pairwise comparisons.

I have two scripts NB.R (based on DESeq2) and KW.R (based on Kruskal-Wallis with FDR) that you can use alternatively for finding taxa/genes with logfold changes. In KW.R I am applying log-relative normalisation first!

The scripts take a NxP dimensional count data with N being samples, and P being feature points (OTUs/genes and so on) and an Nx1 group data (as a data frame) with factor datatypes and generates a barplot for subset of these OTUs/genes that are significantly different.

You can find them here:

Best Wishes,


ADD COMMENTlink written 5.6 years ago by umer.zeeshan.ijaz1.8k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 999 users visited in the last hour