DESeq2: how to define a data.frame with multiple treatments and multiple replicates?
6.7 years ago
madkitty ▴ 620

I've posted this question on a different board, but in case it gets more exposure here, this is what I'm struggling with in DESeq2. I'm having issues with the replicates in our experiment. We want to compare 3 treatments against the control in one heatmap:

  • Ctrl vs T
  • Ctrl vs B
  • Ctrl vs A


Each condition has 3 replicates. The following code works very well when I don't use replicates and simply compare one treatment vs control. Also, when I call mcols, it should show the 3 comparisons we will use in our heatmap, but it only reports CTRL vs T (if I don't use replicates) or T vs A in this case.

So here are my two questions: 

  1. Is there a better way to define samples$condition?
  2. How should I use mcols to have the 3 comparisons?


Thanks  (ps: I'm not an R expert)

# DESeq2 libraries

library( "DESeq2" )
library("Biobase")

# Heatmap libraries
library(RColorBrewer)
library( "genefilter" )
library(gplots) 

# Start loading matrix data
clba = read.table("matrix_duplicates_merged_CLBA.txt", header=TRUE, row.names=1)
head(clba)

samplesclba <- data.frame(row.names=c("C1", "C2", "C3", "T1", "T2", "T3", "B1", "B2", "B3", "A1", "A2", "A3"), condition=as.factor(c(rep("C",3), rep("T", 3), rep("B", 3), rep("A", 3))))

## Relevel doesn't work with replicates
samples$condition <- relevel(samples$condition, "C")
Error in samples$condition : object of type 'closure' is not subsettable
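
A likely cause of this error, assuming no object called `samples` was created earlier in the session: the data frame above is named `samplesclba`, so `samples` resolves to a function rather than to the sample table, and replicates are not the issue. A minimal sketch of the relevel step using the name from the question:

```r
# Sample table as defined in the question: 4 conditions x 3 replicates
samplesclba <- data.frame(
  row.names = c("C1", "C2", "C3", "T1", "T2", "T3",
                "B1", "B2", "B3", "A1", "A2", "A3"),
  condition = factor(rep(c("C", "T", "B", "A"), each = 3))
)

# Relevel the column of the object that actually exists (samplesclba, not samples)
samplesclba$condition <- relevel(samplesclba$condition, ref = "C")

# "C" is now the first (reference) level; the other levels keep their order
levels(samplesclba$condition)
```

Without the relevel, the levels are sorted alphabetically, so "A" becomes the reference level. That would explain why the default results() call reports T vs A.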

 

Here is the second part of the code, where I'm having issues with the data frame and mcols. I don't think I'm defining the data.frame correctly, and I've tried the results() and coef() functions without success.

# Launch DESeq2

ddsclba <- DESeqDataSetFromMatrix(countData = as.matrix(clba), colData=samplesclba, design=~condition)
ddsclba <- DESeq(ddsclba, betaPrior=FALSE)

At this point, when I don't use replicates in samples$condition, I get a clear comparison of Control vs. Treatment T. But it's unclear whether Control vs. B and Control vs. A are also included in the log fold change calculation. Here, since I use replicates, I can't relevel to Control, and even if I could, it wouldn't show the 3 comparisons I'm looking for, i.e. C vs. T, C vs. B and C vs. A.


# Results
resclba <- results( ddsclba )
mcols(resclba, use.names=TRUE)
DataFrame with 6 rows and 2 columns
                       type                           description
                <character>                           <character>
baseMean       intermediate           the base mean over all rows
log2FoldChange      results  log2 fold change: condition T vs A
lfcSE               results    standard error: condition T vs A
stat                results    Wald statistic: condition T vs A
pvalue              results Wald test p-value: condition T vs A
padj                results                  BH adjusted p-values
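
The output above describes a single comparison because one results() call returns one comparison; by default it is the last factor level against the reference level. A sketch of one way to get all three comparisons against control is to call results() once per treatment with the `contrast` argument. The counts below are simulated with makeExampleDESeqDataSet() purely as a stand-in for the real matrix, and the object names `dds`, `res_T`, `res_B` and `res_A` are placeholders.

```r
library(DESeq2)

# Simulated stand-in data: 1000 genes x 12 samples (not the real count matrix)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Same layout as in the question: 4 conditions x 3 replicates, control first
dds$condition <- factor(rep(c("C", "T", "B", "A"), each = 3),
                        levels = c("C", "T", "B", "A"))

dds <- DESeq(dds, betaPrior = FALSE)

# Coefficients of the fitted model, one per treatment vs the reference level
resultsNames(dds)

# One results table per comparison: contrast = c(factor, numerator, denominator)
res_T <- results(dds, contrast = c("condition", "T", "C"))
res_B <- results(dds, contrast = c("condition", "B", "C"))
res_A <- results(dds, contrast = c("condition", "A", "C"))

# Each table now describes its own comparison
mcols(res_T)$description
```

All three tables come from the same fitted model, so DESeq() only needs to run once.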

# Heatmap with top 35 genes
rldclba <- rlogTransformation(ddsclba)
topVarGenesclba <- order( rowVars( assay(rldclba) ), decreasing=TRUE ) [1:35]
hmcol <- colorRampPalette( rev(brewer.pal(9, "RdBu")))(255)

heatmap.2( assay(rldclba)[ topVarGenesclba, ], Colv=FALSE, scale="row", trace="none", dendrogram="row", col = hmcol )
deseq2 RNA-Seq heatmap • 6.6k views
6.7 years ago
Michael Love ★ 2.2k

link to answers on the crosspost: http://seqanswers.com/forums/showthread.php?t=45347
