Question

comparing each time point with deseq2 and their results

1

Entering edit mode

9.5 years ago

bioinforupesh2009.au ▴ 140

Using One condition and 4 times, I have compared each Time point and got a little surprising results - all comparisons have same pvalues and padj. Can anyone tell me if what I am doing is correct and my doubt it valuable

Indeed, I also figure out that baseMean in between two time points should not be the same. But I have it. And I really don't know why?

script in DESeq2:

> # design
> ExpDesign <- data.frame(row.names=colnames(ko), condition = design)

> ExpDesign
        condition.label condition.Time
KO.T1.1              KO             T1
KO.T1.2              KO             T1
KO.T1.3              KO             T1
KO.T2.1              KO             T2
...
.......

> ##DESeqDataSet
> bckCDS <- DESeqDataSetFromMatrix(countData = ko, colData=ExpDesign, design= ~condition.Time)
> ddsLRT= DESeq(bckCDS, test ="LRT", fitType='local', reduced=~1)
> resultsNames(ddsLRT)
> resultsNames(ddsLRT)
[1] "Intercept"               "condition.Time_T2_vs_T1" "condition.Time_T3_vs_T1"
[4] "condition.Time_T4_vs_T1"

> ## by default first (T1) and last (T4) comparison
> resLRT=results(ddsLRT, cooksCutoff=F, independentFiltering = F)
> resLRT$symbol <- mcols(ddsLRT)$symbol
> head(resLRT[order(resLRT$pvalue),],2)
log2 fold change (MLE): condition.Time T4 vs T1
LRT p-value: '~ condition.Time' vs '~ 1'
DataFrame with 4 rows and 6 columns
                     baseMean log2FoldChange     lfcSE      stat       pvalue         padj
                    <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
mmu-miR-145b         35.51911     -0.4172776 0.3513721  56.82489 2.800864e-12 3.431058e-09
mmu-miR-181a-2-3p   144.74966     -0.8433754 0.5111826  42.11996 3.783699e-09 1.792311e-06
> sum(resLRT$padj < 0.05)
[1] 203

> ## for each time point
> lfc_resKO_T2_T1 <- results(ddsLRT,cooksCutoff=F, independentFiltering = F, contrast=c("condition.Time","T2","T1"))
> head(lfc_resKO_T2_T1[order(lfc_resKO_T2_T1$pvalue),],2)
log2 fold change (MLE): condition.Time T2 vs T1
LRT p-value: '~ condition.Time' vs '~ 1'
DataFrame with 4 rows and 6 columns
                                  baseMean log2FoldChange     lfcSE      stat       pvalue         padj
                            <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
mmu-miR-145b         35.51911     -2.6590521 0.4283991  56.82489 2.800864e-12 3.431058e-09
mmu-miR-181a-2-3p   144.74966     -3.2386468 0.5431705  42.11996 3.783699e-09 1.792311e-06
> sum(lfc_resKO_T2_T1$padj < 0.05)
[1] 203
> lfc_resKO_T3_T2 <- results(ddsLRT,cooksCutoff=F, independentFiltering = F, contrast=c("condition.Time","T3","T2"))
> head(lfc_resKO_T3_T2[order(lfc_resKO_T3_T2$pvalue),],2)
log2 fold change (MLE): condition.Time T3 vs T2
LRT p-value: '~ condition.Time' vs '~ 1'
DataFrame with 4 rows and 6 columns
                     baseMean log2FoldChange     lfcSE      stat       pvalue         padj
                    <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
mmu-miR-145b         35.51911     2.75662374 0.3913252  56.82489 2.800864e-12 3.431058e-09
mmu-miR-181a-2-3p   144.74966     3.08362802 0.4757690  42.11996 3.783699e-09 1.792311e-06
> sum(lfc_resKO_T3_T2$padj < 0.05)
[1] 203

Thank you and any advice?

Gene-expression RNA-Seq R • 4.0k views

ADD COMMENT • link updated 2.2 years ago by Ram 43k • written 9.5 years ago by bioinforupesh2009.au ▴ 140

Ram · Answer 1 · 2014-11-12

It appears you still have not read the documentation, though you have been asked to do so on previous threads (1, 2). This question is specifically discussed in ?results. Type into your R console, ?results, and then read the details.

The information which answers your question is also printed in your code above. For each comparison, the results table prints out: LRT p-value: '~ condition.Time' vs '~ 1'. So the p-value is from the likelihood ratio test of a model including time information and one without (only fitting an intercept).

The base mean is over all samples in the dataset, not the subset of samples specified by 'contrast'.