Question: DESeq2 doesn't compare the right ctrl vs treatment?
1
gravatar for madkitty
5.0 years ago by
madkitty580
Canada
madkitty580 wrote:

We have a very simple experiment, 1 control (C) and 3 independent treatments (T, B, A). So technically I wrote the following script thinking it would compare C to T, then C to B and finally C to A.

Though at the bottom of my pipeline, it seems that DESeq2 is comparing treatments T vs A. Is there a better way to write the condition so it would compare independentely control to each treatment ? (After that I should cluster genes and output a heatmap)

 

# DESeq1 libraries
library( "DESeq2" )
library("Biobase")
# Heatmap libraries
library( "genefilter" )
library(gplots) 

 

r2 = read.table("matrix.txt", header=TRUE, row.names=1)
head(r2)
samples <- data.frame(row.names=c("C1", "T1", "B1", "A1"), condition=as.factor(c("C1", "T1", "B1", "A1")))
dds <- DESeqDataSetFromMatrix(countData = as.matrix(r2), colData=samples, design=~condition)

# Run DESeq2
dds <- DESeq(dds)

# DESeq2 Results
res <- results(dds)
head (res)


**********
mcols(res, use.names=TRUE)
> mcols(res, use.names=TRUE)
DataFrame with 6 rows and 2 columns
                       type                                  description
                <character>                                  <character>
baseMean       intermediate                  the base mean over all rows
log2FoldChange      results log2 fold change (MAP): condition T1 vs A1
lfcSE               results         standard error: condition T1 vs A1
stat                results         Wald statistic: condition T1 vs A1
pvalue              results      Wald test p-value: condition T1 vs A1
padj                results                         BH adjusted p-values

rna-seq deseq2 • 7.2k views
ADD COMMENTlink modified 5.0 years ago by Devon Ryan91k • written 5.0 years ago by madkitty580
6
gravatar for Devon Ryan
5.0 years ago by
Devon Ryan91k
Freiburg, Germany
Devon Ryan91k wrote:

This has nothing to do with DESeq2. It can't read your mind that you want C1 to be the base level. You just need to samples$condition <- relevel(samples$condition, "C1") before making dds.

ADD COMMENTlink written 5.0 years ago by Devon Ryan91k

Oh Thanks, I see where my mistake was. I re-wrote the script, can you confirm this is the proper way to use it?

samples <- data.frame(row.names=c("C1", "T1", "B1", "A1"), condition=as.factor(c("C1", "T1", "B1", "A1")))

samples$condition <- relevel(samples$condition, "C1")
dds <- DESeqDataSetFromMatrix(countData = as.matrix(r2), colData=samples$condition, design=~condition)

 

ADD REPLYlink written 5.0 years ago by madkitty580

That should be correct.

When you make a factor, the levels are ordered lexicographically and most programs will just assume that the first level in the list of levels (i.e., from levels(samples$condition), not the first value of samples$condition) is the base level for comparisons. To do otherwise would open a very large bag of worms (e.g., if you assume that some factors starting with C are control, then you'd piss off people working on cancer:normal comparisons).

ADD REPLYlink written 5.0 years ago by Devon Ryan91k

Unfortunately, it returns an error message. 

I have a first column with gene names (before C1), which should be row.names. Am I using data.frame properly? Should I adapt row.names? 

 

> samples <- data.frame(row.names=c("C1", "T1", "B1", "A1"), condition=as.factor(c("C1", "T1", "B1", "A1")))
> samples$condition <- relevel(samples$condition, "C1")

 

> dds <- DESeqDataSetFromMatrix(countData = as.matrix(r2), colData=samples$condition, design=~condition)
Error in validObject(.Object) : 
  invalid class “SummarizedExperiment” object: invalid object for slot "colData" in class "SummarizedExperiment": got class "factor", sho
uld be or extend class "DataFrame"

ADD REPLYlink written 5.0 years ago by madkitty580

dds <- DESeqDataSetFromMatrix(countData = as.matrix(r2), colData=samples, design=~condition)

ADD REPLYlink written 5.0 years ago by Devon Ryan91k

Thanks! it seems to be working well now. Though when I run mcols, it says T1 vs C1. What happened to treatment A and B? Is it still comparing them separately versus the control C1?

> mcols(res, use.names=TRUE)
DataFrame with 6 rows and 2 columns
                       type                                     description
                <character>                                     <character>
baseMean       intermediate                     the base mean over all rows
log2FoldChange      results log2 fold change (MAP): condition T1 vs C1
lfcSE               results         standard error: condition T1 vs C1
stat                results         Wald statistic: condition T1 vs C1
pvalue              results      Wald test p-value: condition T1 vs C1
padj                results                            BH adjusted p-values

 

ADD REPLYlink modified 5.0 years ago • written 5.0 years ago by madkitty580

The results() function has a 'coef' option (or something like that) that you can use to extract the results for the other coefficients.
 

ADD REPLYlink written 5.0 years ago by Devon Ryan91k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 707 users visited in the last hour