DESeq2 analysis for multiple conditions
5
4
Entering edit mode
6.4 years ago

Hi,

I want to do DE analysis using DESeq2. My experiment is a small RNAseq experiment with 5 tissue samples. I want to find the DE genes between the different tissues.

I have the following matrix (as an example):

gene    tissue1    tissue2    tissue3    tissue4    tissue5
gene1    233    91    17    593    93
gene2    1011    0    7    1    11
gene3    963    2    3    66    2
gene4    908    41    1    74    33
gene5    596    50    26    328    104
gene6    1    0    0    0    1111
gene7    202    187    35    425    277
gene8    985    24    10    76    33
gene9    523    87    32    286    203
gene10    822    82    23    120    87

My aim is to find the DE genes between each pair of columns, i.e. tissue1 vs tissue2, tissue1 vs tissue3, ... tissue2 vs tissue3, ... tissue4 vs tissue5.

I don't fully understand the program. I tried following the vignette as follows:

> library("DESeq2")

gene    tissue1    tissue2    tissue3    tissue4
gene1    233    91    17    593
gene2    1011    0    7    1
gene3    963    2    3    66
gene4    908    41    1    74
gene5    596    50    26    328
gene6    1    0    0    0

> colData = data.frame(
+ row.names= colnames(CountTable),
+ condition = c("tissue1", "tissue2", "tissue3", "tissue4", "tissue5"),
+ libType = c( "single-end", "single-end", "single-end", "single-end", "single-end"))

> dds <- DESeqDataSetFromMatrix( countData = CountTable, colData = colData,
+ design = ~ condition)
> dds <- DESeq(dds)
#estimating size factors
#estimating dispersions
#gene-wise dispersion estimates
#mean-dispersion relationship
#final dispersion estimates
#fitting model and testing
#Warning message:
#In checkForExperimentalReplicates(object, modelMatrix) :
#same number of samples and coefficients to fit,
#estimating dispersion by treating samples as replicates.
#read the ?DESeq section on 'Experiments without replicates'

> res <- results(dds)
> res

#log2 fold change (MAP): condition tissue5 vs tissue1
#Wald test p-value: condition tissue5 vs tissue1
#DataFrame with 927 rows and 6 columns
baseMean log2FoldChange     lfcSE        stat    pvalue      padj

So the result only gives me the comparison between tissue5 and tissue1. What do I need to do to get the comparison between every pair of tissues?

Help is greatly appreciated.

P.S.: I am new to R and this is my first time using DESeq2.

RNA-Seq DESeq2 • 22k views

If you only have 5 samples, and they are all different, you can't do any kind of sophisticated statistical analysis on them. There is natural variance of expression, but without biological replicates, you have zero idea what it is. DESeq2 might give you numbers, but they don't mean much.

6.4 years ago

In looking up 'results' in the DESeq2 manual at http://www.bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf , I find the following information which suggests that it is only doing the first and last condition comparison:

The results table when printed will provide the information about the comparison, e.g. "log2 fold change (MAP): condition treated vs untreated", meaning that the estimates are of log2(treated /untreated), as would be returned by contrast=c("condition","treated","untreated"). Multiple results can be returned for analyses beyond a simple two group comparison, so results takes arguments contrast and name to help the user pick out the comparisons of interest for printing a results table. The use of the contrast argument is recommended for exact specification of the levels which should be compared and their order. If results is run without specifying contrast or name , it will return the comparison of the last level of the last variable in the design formula over the first level of this variable. For example, for a simple two-group comparison, this would return the log2 fold changes of the second group over the first group (the reference level). Please see examples below and in the vignette.
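To make the quoted rule concrete, here is a minimal base-R sketch (no DESeq2 needed) that works out which comparison results() would pick by default for the five tissue levels in the question, and lists every pairwise contrast vector. The variable names (level_pairs, contrast_list) are my own, and I am assuming the default alphabetical ordering of factor levels.

```r
# Condition labels as in the question: one sample per tissue.
condition <- factor(c("tissue1", "tissue2", "tissue3", "tissue4", "tissue5"))

# Factor levels are sorted alphabetically unless set explicitly.
lv <- levels(condition)

# Per the manual text above, the default is last level over first level,
# written as c(variable, numerator, denominator).
default_contrast <- c("condition", lv[length(lv)], lv[1])
print(default_contrast)

# Enumerate every unordered pair of levels and write each one as a
# contrast vector with the later level as numerator.
level_pairs <- combn(lv, 2, simplify = FALSE)
contrast_list <- lapply(level_pairs, function(p) c("condition", p[2], p[1]))
names(contrast_list) <- vapply(level_pairs,
                               function(p) paste(p[2], "vs", p[1]),
                               character(1))
print(names(contrast_list))
```

Each element of contrast_list could then be passed as the contrast argument of results() on a fitted dds object, one call per pair.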

On a side note, did you look up the information in ?DESeq on 'Experiments without replicates' as the warning message says? I copied a little bit from the manual:

Experiments without replicates do not allow for estimation of the dispersion of counts around the expected value for each group, which is critical for differential expression analysis. If an experimental design is supplied which does not contain the necessary degrees of freedom for differential analysis, DESeq will provide a message to the user and follow the strategy outlined in Anders and Huber (2010) under the section 'Working without replicates', wherein all the samples are considered as replicates of a single group for the estimation of dispersion. As noted in the reference above: "Some overestimation of the variance may be expected, which will make that approach conservative." Furthermore, "while one may not want to draw strong conclusions from such an analysis, it may still be useful for exploration and hypothesis generation."
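The "necessary degrees of freedom" point can be checked with base R alone. The sketch below builds the design matrix for ~ condition with one sample per tissue and counts what is left over for estimating dispersion. The sample names and the 3-replicate layout are hypothetical, only there for comparison.

```r
# One sample per tissue, as in the question (sample names are made up).
col_data <- data.frame(
  condition = factor(paste0("tissue", 1:5)),
  row.names = paste0("sample", 1:5)
)

# Design matrix for ~ condition: an intercept plus one column per
# non-reference tissue.
design_matrix <- model.matrix(~ condition, data = col_data)
n_samples <- nrow(design_matrix)
n_coefs   <- ncol(design_matrix)

# Residual degrees of freedom: what remains to estimate variability.
residual_df <- n_samples - n_coefs
print(c(samples = n_samples, coefficients = n_coefs, residual_df = residual_df))

# Hypothetical layout with 3 biological replicates per tissue.
col_data_rep <- data.frame(
  condition = factor(rep(paste0("tissue", 1:5), each = 3))
)
residual_df_rep <- nrow(col_data_rep) -
  ncol(model.matrix(~ condition, data = col_data_rep))
print(residual_df_rep)
```

With five samples and five coefficients nothing is left over, which is the situation the warning message describes. Replicates are what free up degrees of freedom.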

6.4 years ago

You can perform this analysis on the Genestack platform. Its DGE tool, Expression Navigator, is based on the DESeq2 (or edgeR) R package. It can find the DE genes between multiple groups of samples (in your case, 5 groups according to tissue condition).

11 weeks ago
195472005 • 0

DESeq2 can return results for any pair of the groups in your design. It looks like the trouble is in how you extract the results with this command:

results(dds)

Without a contrast argument, this returns the last level of condition over the first, so it is equivalent to:

results(dds, contrast=c("condition","tissue5","tissue1"))

If you want results for other pairs, call results again with a different contrast. The second element is the numerator of the log2 fold change and the third is the denominator:

results(dds, contrast=c("condition","tissue1","tissue4")) # extracts tissue1 vs tissue4, i.e. log2(tissue1/tissue4)

results(dds, contrast=c("condition","tissue1","tissue3")) # extracts tissue1 vs tissue3, i.e. log2(tissue1/tissue3)

These lines in the manual relate to this problem:

Multiple results can be returned for analyses beyond a simple two group comparison, so results takes arguments contrast and name to help the user pick out the comparisons of interest for printing a results table
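For anyone who wants all pairs in one go, here is a rough sketch of looping results() over every pair of tissues. It does not use the poster's data: the counts are simulated, and I am assuming a hypothetical design with 3 biological replicates per tissue so that dispersion can be estimated. The names counts, col_data, tissue_pairs and res_list are my own.

```r
library(DESeq2)

set.seed(1)
tissues <- paste0("tissue", 1:5)
n_rep   <- 3    # hypothetical: 3 biological replicates per tissue
n_genes <- 500
n_samp  <- length(tissues) * n_rep

# Simulated negative-binomial counts (NOT real data): each gene has its
# own mean, shared by all samples, so few genes should come out as DE.
gene_means <- 2^runif(n_genes, min = 4, max = 10)
counts <- matrix(
  rnbinom(n_genes * n_samp, mu = rep(gene_means, times = n_samp), size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0(rep(tissues, each = n_rep), "_rep", seq_len(n_rep)))
)

# Sample table: one row per column of counts, in the same order.
col_data <- data.frame(
  condition = factor(rep(tissues, each = n_rep), levels = tissues),
  row.names = colnames(counts)
)

dds <- DESeqDataSetFromMatrix(countData = counts, colData = col_data,
                              design = ~ condition)
dds <- DESeq(dds, quiet = TRUE)

# One results table per pair; contrast is c(variable, numerator, denominator).
tissue_pairs <- combn(tissues, 2, simplify = FALSE)
res_list <- lapply(tissue_pairs, function(p) {
  results(dds, contrast = c("condition", p[2], p[1]))
})
names(res_list) <- vapply(tissue_pairs,
                          function(p) paste(p[2], "vs", p[1]),
                          character(1))

# Example: peek at one comparison.
head(res_list[["tissue4 vs tissue1"]])
```

The model is fitted once with DESeq() and only results() is repeated, so the ten tables share the same dispersion estimates.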

11 weeks ago
195472005 • 0

    Error in checkForExperimentalReplicates(object, modelMatrix) :
The design matrix has the same number of samples and coefficients to fit,
so estimation of dispersion is not possible. Treating samples
as replicates was deprecated in v1.20 and no longer supported since v1.22.


Multiple results can be returned for analyses beyond a simple two group comparison, so results takes arguments contrast and name to help the user pick out the comparisons of interest for printing a results table

results(dds,contrast=c("condition","tissue1", "tissue4")) # extract the differential expression results for tissue1 vs tissue4

results(dds,contrast=c("condition","tissue1", "tissue3")) # extract the differential expression results for tissue1 vs tissue3