Pairwise comparisons between multiple groups with DESeq2
6.8 years ago
lazappi • 0

Hi

I have a set of 12 RNA-seq samples spread across 5 different groups (different tissues). At the moment I am focusing on making pairwise differential expression comparisons between the groups using DESeq2. I've noticed that different results are returned depending on whether I pass all the data to DESeq2 then request a contrast between two groups or instead manually select the groups I'm interested in then give just that data to DESeq2:

library(DESeq2)

# Sample table: sample name, HTSeq-count file and group
des2.samples <- data.frame(name, count.file, group)

# Not sure if I should do this!
# des2.samples <- des2.samples[des2.samples$group %in% c("Group2", "Group3"), ]

# Create DESeqDataSet object from HTSeq-count files
des2.data <- DESeqDataSetFromHTSeqCount(des2.samples, design = ~ group, directory = "data")

# Calculate DE and get results
des2.data <- DESeq(des2.data)
des2.res <- results(des2.data, contrast = c("group", "Group2", "Group3"))

# Order by padj
des2.res <- des2.res[order(des2.res$padj), ]

# Check out the summary
summary(des2.res)

I believe the differences are likely due to how DESeq2 does its filtering, but I'm unsure what the best approach is, particularly as one group is a clear outlier to the others and may skew the results. I'm also wondering whether a similar effect would be seen with other packages (edgeR, DESeq, voom, etc.) and whether they would need to be treated differently.

Thanks

RNA-Seq DESeq2 differential expression groups R • 5.4k views
6.8 years ago

It depends on what you want. If you want to capture overall differences (which will obviously be skewed by the one group being an outlier), you can pool the groups together; otherwise, you can do pairwise comparisons.
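To see how the two options behave side by side, here is a rough sketch on simulated counts. It is only an illustration under assumptions: the count matrix, the group sizes (3, 3, 2, 2, 2) and all object names are made up, and the data are built with `DESeqDataSetFromMatrix` rather than from HTSeq-count files. As far as I know, DESeq2 estimates dispersions from every sample it is given, so the extra groups can change the Group2 vs Group3 result even though the contrast is the same.

```r
library(DESeq2)

set.seed(1)

# Simulated data (assumption): 2000 genes, 12 samples in 5 groups
n.genes <- 2000
group <- factor(rep(paste0("Group", 1:5), times = c(3, 3, 2, 2, 2)))
mu <- 2^runif(n.genes, 4, 10)                 # per-gene mean count
disp <- 0.1 + 4 / mu                          # per-gene dispersion
counts <- matrix(rnbinom(n.genes * 12, mu = rep(mu, 12), size = rep(1 / disp, 12)),
                 ncol = 12,
                 dimnames = list(paste0("gene", seq_len(n.genes)),
                                 paste0("S", 1:12)))
storage.mode(counts) <- "integer"
coldata <- data.frame(group = group, row.names = colnames(counts))

# Option 1: fit all groups together, then request one contrast
dds.all <- DESeqDataSetFromMatrix(counts, coldata, design = ~ group)
dds.all <- DESeq(dds.all)
res.all <- results(dds.all, contrast = c("group", "Group2", "Group3"))

# Option 2: keep only the two groups of interest before fitting
keep <- coldata$group %in% c("Group2", "Group3")
coldata.sub <- droplevels(coldata[keep, , drop = FALSE])
dds.sub <- DESeqDataSetFromMatrix(counts[, keep], coldata.sub, design = ~ group)
dds.sub <- DESeq(dds.sub)
res.sub <- results(dds.sub, contrast = c("group", "Group2", "Group3"))

# Cross-tabulate which genes pass padj < 0.1 under each option
print(table(all = res.all$padj < 0.1, subset = res.sub$padj < 0.1, useNA = "ifany"))
```

On real data it is worth comparing the two result tables like this before deciding, especially when one group is much more variable than the rest.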

I have two scripts, NB.R (based on DESeq2) and KW.R (based on Kruskal-Wallis with FDR correction), that you can use as alternatives for finding taxa/genes with log fold changes. In KW.R I apply log-relative normalisation first!

The scripts take N x P count data, with N being the number of samples and P the number of features (OTUs, genes and so on), plus N x 1 group data (as a data frame) of factor type, and generate a barplot for the subset of OTUs/genes that are significantly different.

You can find them here:

http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/ecological.html
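For a feel of what the Kruskal-Wallis route involves, here is a minimal base-R sketch. It is not the KW.R script itself. The simulated N x P counts, the object names, and the reading of "log-relative normalisation" as the log of per-sample proportions with a pseudocount of 1 are all assumptions made for illustration.

```r
set.seed(42)

# Simulated input (assumption): N samples in rows, P features in columns
N <- 12
P <- 50
groups <- data.frame(group = factor(rep(paste0("Group", 1:5),
                                        times = c(3, 3, 2, 2, 2))))
counts <- matrix(rnbinom(N * P, mu = 100, size = 5),
                 nrow = N,
                 dimnames = list(paste0("S", 1:N), paste0("gene", 1:P)))

# Log-relative normalisation (assumed form): add a pseudocount,
# divide each sample (row) by its total, then take the log
rel <- (counts + 1) / rowSums(counts + 1)
log.rel <- log(rel)

# Kruskal-Wallis test per feature (column) across the groups
p.values <- apply(log.rel, 2, function(x) kruskal.test(x, groups$group)$p.value)

# Benjamini-Hochberg FDR correction across all features
padj <- p.adjust(p.values, method = "BH")

# Features passing the chosen FDR threshold
significant <- names(padj)[padj < 0.05]
print(significant)
```

With only two or three samples per group a rank-based test has very little power, so this is better suited to designs with more replicates.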

Best Wishes,

Umer