Question: DESeq2 analysis for multiple conditions
3
4.3 years ago by
France
eager_learner40 wrote:

Hi,

I want to do a DE analysis using DESeq2. My experiment is a small RNA-seq experiment with 5 tissue samples, and I want to find the DE genes between the different tissues.

I have the following count matrix (as an example):

gene tissue1 tissue2 tissue3 tissue4 tissue5
gene1 233 91 17 593 93
gene2 1011 0 7 1 11
gene3 963 2 3 66 2
gene4 908 41 1 74 33
gene5 596 50 26 328 104
gene6 1 0 0 0 1111
gene7 202 187 35 425 277
gene8 985 24 10 76 33
gene9 523 87 32 286 203
gene10 822 82 23 120 87

My aim is to find the DE genes between each pair of columns, i.e. tissue1 vs tissue2, tissue1 vs tissue3, ... tissue2 vs tissue3, ... tissue4 vs tissue5.

I don't fully understand the program. I tried following the package vignette as follows:

> library("DESeq2")
> CountTable  = read.table("test.tsv", header=TRUE, row.names=1)                
> head(CountTable)            
             
gene    tissue1    tissue2    tissue3    tissue4
gene1    233    91    17    593
gene2    1011    0    7    1
gene3    963    2    3    66
gene4    908    41    1    74
gene5    596    50    26    328
gene6    1    0    0    0
                
> colData = data.frame(                
+ row.names= colnames(CountTable),                
+ condition = c("tissue1", "tissue2", "tissue3", "tissue4", "tissue5"),                
+ libType = c( "single-end", "single-end", "single-end", "single-end", "single-end"))                
                
> dds <- DESeqDataSetFromMatrix( countData = CountTable, colData = colData,
+ design = ~ condition)
> dds <- DESeq(dds)                
#estimating size factors                
#estimating dispersions                
#gene-wise dispersion estimates                
#mean-dispersion relationship                
#final dispersion estimates                
#fitting model and testing                
#Warning message:                
#In checkForExperimentalReplicates(object, modelMatrix) :                
 #same number of samples and coefficients to fit,                
 #estimating dispersion by treating samples as replicates.                
 #read the ?DESeq section on 'Experiments without replicates'                
                
> res <- results(dds)
> res

#log2 fold change (MAP): condition tissue5 vs tissue1
#Wald test p-value: condition tissue5 vs tissue1
#DataFrame with 927 rows and 6 columns                
                   baseMean log2FoldChange     lfcSE        stat    pvalue      padj   

 

So in the result I only get the comparison between tissue5 and tissue1. What do I need to do to get the comparison between each pair of tissues?

Help is greatly appreciated.

P.S.: I am new to R, and this is my first time using DESeq2.

 

rna-seq deseq2 • 15k views
modified 23 months ago by Biostar ♦♦ 20 • written 4.3 years ago by eager_learner40

If you only have 5 samples, and they are all different, you can't do any kind of sophisticated statistical analysis on them. There is natural variance of expression, but without biological replicates, you have zero idea what it is. DESeq2 might give you numbers, but they don't mean much.

written 23 months ago by swbarnes26.5k
2
4.3 years ago by
eromasko120
United States
eromasko120 wrote:

Looking up 'results' in the DESeq2 manual at http://www.bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf , I found the following, which explains why you only get the comparison of the last condition against the first:

The results table when printed will provide the information about the comparison, e.g. "log2 fold change (MAP): condition treated vs untreated", meaning that the estimates are of log2(treated/untreated), as would be returned by contrast=c("condition","treated","untreated"). Multiple results can be returned for analyses beyond a simple two group comparison, so results takes arguments contrast and name to help the user pick out the comparisons of interest for printing a results table. The use of the contrast argument is recommended for exact specification of the levels which should be compared and their order. If results is run without specifying contrast or name, it will return the comparison of the last level of the last variable in the design formula over the first level of this variable. For example, for a simple two-group comparison, this would return the log2 fold changes of the second group over the first group (the reference level). Please see examples below and in the vignette.
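To make that concrete for the five tissues in the question, here is a minimal sketch of how one might build every pairwise contrast and hand each one to results(). The level names and the `dds` object are taken from the question; the list names (`tissue2_vs_tissue1` and so on) and the `res_list` variable are just illustrative choices, and the results() call is only attempted if a fitted `dds` already exists in the session.

```r
# Sketch: enumerate every pairwise comparison among the tissue levels.
tissues <- c("tissue1", "tissue2", "tissue3", "tissue4", "tissue5")
pairs <- combn(tissues, 2, simplify = FALSE)  # 10 unordered pairs

# One contrast vector per pair: c(variable, numerator level, denominator level).
contrasts <- lapply(pairs, function(p) c("condition", p[2], p[1]))
names(contrasts) <- vapply(
  pairs,
  function(p) paste(p[2], "vs", p[1], sep = "_"),
  character(1)
)

# With a fitted DESeqDataSet called `dds` (as in the question), each
# contrast can be passed to results(); skipped here if `dds` is absent.
if (exists("dds") && requireNamespace("DESeq2", quietly = TRUE)) {
  res_list <- lapply(contrasts, function(ct) DESeq2::results(dds, contrast = ct))
}
```

Each element of `res_list` would then be the results table for one pair, with the log2 fold change reported as the second-named tissue over the first.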

On a side note, did you look up the information in ?DESeq on 'Experiments without replicates', as the warning message suggests? Here is a short excerpt from the manual:

Experiments without replicates do not allow for estimation of the dispersion of counts around the expected value for each group, which is critical for differential expression analysis. If an experimental design is supplied which does not contain the necessary degrees of freedom for differential analysis, DESeq will provide a message to the user and follow the strategy outlined in Anders and Huber (2010) under the section ’Working without replicates’, wherein all the samples are considered as replicates of a single group for the estimation of dispersion. As noted in the reference above: "Some overestimation of the variance may be expected, which will make that approach conservative." Furthermore, "while one may not want to draw strong conclusions from such an analysis, it may still be useful for exploration and hypothesis generation."
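For the "exploration and hypothesis generation" use mentioned in the excerpt, one option is to skip testing altogether and just rank genes by normalized fold change. The sketch below is plain base R, not a DESeq2 call: it uses the example counts from the question, a median-of-ratios style normalization written by hand, and a pseudocount of 1 that is an arbitrary assumption to avoid dividing by zero. It gives no p-values, so treat the ranking as a starting point only.

```r
# Example counts from the question (rows = genes, columns = tissues).
counts <- matrix(
  c(233,  91, 17, 593,   93,
    1011,  0,  7,   1,   11,
    963,   2,  3,  66,    2,
    908,  41,  1,  74,   33,
    596,  50, 26, 328,  104,
    1,     0,  0,   0, 1111,
    202, 187, 35, 425,  277,
    985,  24, 10,  76,   33,
    523,  87, 32, 286,  203,
    822,  82, 23, 120,   87),
  nrow = 10, byrow = TRUE,
  dimnames = list(paste0("gene", 1:10), paste0("tissue", 1:5))
)

# Median-of-ratios size factors: compare each sample to the per-gene
# geometric mean, using only genes with no zero counts.
log_geo_mean <- rowMeans(log(counts))  # -Inf when a gene has a zero
usable <- is.finite(log_geo_mean)
size_factors <- apply(counts, 2, function(col) {
  exp(median(log(col[usable]) - log_geo_mean[usable]))
})

# Normalized counts and an exploratory log2 fold change, tissue2 over tissue1.
norm <- sweep(counts, 2, size_factors, "/")
lfc <- log2((norm[, "tissue2"] + 1) / (norm[, "tissue1"] + 1))

# Genes ordered by absolute fold change, largest first.
ranked <- sort(abs(lfc), decreasing = TRUE)
```

Swapping the two column names gives any other pair of tissues.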

 

modified 4.3 years ago • written 4.3 years ago by eromasko120
0
4.3 years ago by
New Zealand
Evgeniia Golovina990 wrote:

You can perform this analysis on the Genestack platform. Its DGE tool, Expression Navigator, is based on the DESeq2 (or edgeR) R package, and it can find the DE genes between multiple groups of samples (in your case, 5 groups according to tissue).

written 4.3 years ago by Evgeniia Golovina990