Pairwise comparisons between multiple groups with DESeq2
9.1 years ago
lazappi • 0

Hi

I have a set of 12 RNA-seq samples spread across 5 different groups (different tissues). At the moment I am focusing on making pairwise differential expression comparisons between the groups using DESeq2. I've noticed that the results differ depending on whether I pass all the data to DESeq2 and then request a contrast between two groups, or instead manually select the two groups I'm interested in and give just that data to DESeq2:

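library(DESeq2)

# Sample table: sample name, HTSeq-count file name and group
# (name, count.file and group are vectors defined earlier)
Metadata <- data.frame(name, count.file, group)
des2.samples <- Metadata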

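# Not sure if I should do this!
# des2.samples <- des2.samples[des2.samples$group %in% c("Group2", "Group3"), ]
# If subsetting, also drop the now-empty group levels:
# des2.samples$group <- droplevels(factor(des2.samples$group))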

# Create DESeqDataSet object from HTSeq-count files
des2.data <- DESeqDataSetFromHTSeqCount(des2.samples, design = ~ group, 
                                        directory = "data")

# Calculate DE and get results
des2.data <- DESeq(des2.data)
des2.res <- results(des2.data, contrast = c("group", "Group2", "Group3"))

# Order by padj
des2.res <- des2.res[order(des2.res$padj), ]

# Check out the summary
summary(des2.res)

I believe the differences are likely due to how DESeq2 does its filtering, but I'm unsure what the best approach is, particularly as one group is a clear outlier to the others and may skew the results. I'm also wondering whether a similar effect would be seen with other packages (edgeR, DESeq, voom etc.) and whether they would need to be treated differently.

Thanks

RNA-Seq DESeq2 differential expression groups R • 5.9k views
9.1 years ago

It depends on what you want. If you want to capture overall differences across all the groups (which will obviously be skewed by the one outlier group), you can analyse them all together; otherwise you can run pairwise comparisons on just the two groups of interest.
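To see the two options side by side, here is a minimal sketch on simulated counts. It is only an illustration under stated assumptions: the counts come from DESeq2's makeExampleDESeqDataSet() rather than real HTSeq-count files, the 3/3/2/2/2 group sizes are made up, and the object names (dds_all, dds_sub and so on) are hypothetical. When all samples are fitted together, the dispersion estimates use every group, whereas the subset fit uses only the two groups being compared, which is one reason the results can differ.

```r
library(DESeq2)

set.seed(1)

# Simulated counts: 1000 genes x 12 samples (stand-in for real data)
sim <- makeExampleDESeqDataSet(n = 1000, m = 12)
cts <- counts(sim)

# Hypothetical layout: 12 samples across 5 groups
coldata <- data.frame(
  group = factor(rep(paste0("Group", 1:5), times = c(3, 3, 2, 2, 2))),
  row.names = colnames(cts)
)

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ group)

# Option 1: fit all groups together, then extract one contrast
dds_all <- DESeq(dds)
res_all <- results(dds_all, contrast = c("group", "Group2", "Group3"))

# Option 2: keep only the two groups of interest before fitting
dds_sub <- dds[, dds$group %in% c("Group2", "Group3")]
dds_sub$group <- droplevels(dds_sub$group)  # remove empty factor levels
dds_sub <- DESeq(dds_sub)
res_sub <- results(dds_sub, contrast = c("group", "Group2", "Group3"))

# Compare which genes pass padj < 0.1 under each approach
table(all = res_all$padj < 0.1, subset = res_sub$padj < 0.1,
      useNA = "ifany")
```

The subsetting is done on the unfitted object so that size factors and dispersions are re-estimated from the two groups alone.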

I have two scripts, NB.R (based on DESeq2) and KW.R (based on Kruskal-Wallis with FDR correction), that you can use as alternatives for finding taxa/genes with log-fold changes. In KW.R I apply log-relative normalisation first!

The scripts take N x P count data, with N being the number of samples and P the number of features (OTUs, genes and so on), plus N x 1 group data (a data frame with a factor column), and generate a barplot for the subset of OTUs/genes that are significantly different.

You can find them here:

http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/ecological.html
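For a rough idea of what a Kruskal-Wallis approach with FDR correction can look like, here is a minimal base-R sketch. It is not the KW.R script itself: the data are simulated, all object names are hypothetical, and the "log-relative" step is my assumption (log of per-sample relative abundance with a pseudocount of 1).

```r
set.seed(42)

n_samples  <- 12
n_features <- 50

# Simulated N x P count matrix (samples in rows, features in columns)
counts <- matrix(
  rnbinom(n_samples * n_features, mu = 100, size = 5),
  nrow = n_samples,
  dimnames = list(paste0("S", seq_len(n_samples)),
                  paste0("F", seq_len(n_features)))
)

# N x 1 group data as a data frame with a factor column
groups <- data.frame(
  group = factor(rep(paste0("Group", 1:5), times = c(3, 3, 2, 2, 2)))
)

# Assumed normalisation: log of relative abundance, pseudocount of 1
rel     <- (counts + 1) / rowSums(counts + 1)
log_rel <- log(rel)

# Kruskal-Wallis test per feature across all groups
p_raw <- apply(log_rel, 2, function(x) kruskal.test(x, groups$group)$p.value)

# Benjamini-Hochberg FDR adjustment
p_adj <- p.adjust(p_raw, method = "BH")

kw_res <- data.frame(feature = colnames(counts),
                     p.value = p_raw,
                     padj    = p_adj)
kw_res <- kw_res[order(kw_res$padj), ]
head(kw_res)
```

Note that with only two or three samples per group a rank-based test has very little power, so this is better suited to larger designs.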

Best Wishes,

Umer
