Question: DESeq: obtaining DEGs from uneven dataset (some with replicates, some without)
0
gravatar for exin
5 months ago by
exin50
Australia
exin50 wrote:

I'm trying to perform differential gene expression analyses using DESeq, on a pre-existing dataset, where 8 cell types (A to H) were identified from MARS-Seq on whole organism. Gene expression counts are given for

A x 9 replicates
B x 1
C x 1
D x 3
E x 1
F x 1
G x 4
H x 1

In order to find genes differentially expressed in A, I contrast A against "notA" (replacing B to H with "notA")

dds<-DESeqDataSetFromMatrix(countData = cts, colData = coldata,design=~condition)
dds$condition<-factor(dds$condition,levels=c("A","notA"))
dds<-DESeq(dds)
alpha<-0.1
res<-results(dds,contrast=c("condition","A","notA"),alpha=alpha) 
reslfc<-lfcShrink(dds,contrast=c("condition","A","notA"),res=res, alpha=alpha)

and so on for the other 7 cell types.

However, I found that I get much more DEGs for A D & G, i.e. samples with more reps. In fact, I have no DEGs at all for B, C E, F and H.

Anyway I can overcome this? Or should I not be performing the analysis this way?

rna-seq gene expression • 229 views
ADD COMMENTlink modified 5 months ago • written 5 months ago by exin50
1

Statistical power is (among other factors) a function of the replication number. Please use google and the search function for DESeq2-related questions on replication, replicate numbers and experiments with no replicates. You'll literally find dozens of posts, blogs, bioconductor questions and responses from (among others) Mike Love (the first author and maintainer of DESeq2).

ADD REPLYlink written 5 months ago by ATpoint21k

Thanks ATpoint. Besides treating it as an exploratory analysis / not drawing firm conclusions from the dataset, I'm wondering how exactly to augment the test / result, by accepting the assumption that variance-mean estimates from replicated samples can be applied to unreplicated ones?

ADD REPLYlink written 5 months ago by exin50

I think (but I am really not sure) that DESeq2 uses the dispersion estimates between the replicated group and projects them to the unreplicated one, but again, this is only something I faintly remember from a bioconductor post so better double check this.

ADD REPLYlink written 5 months ago by ATpoint21k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1477 users visited in the last hour