Question: Setting Up Model in DeSeq2
gravatar for lkw222
10 days ago by
lkw22230 wrote:

I usually stick to genome seq data, so this is my first time using deseq2. I have RNAseq data from Drosophila that were fed either a control diet (C) or supplemented diet (A) while under going physiological stress. They were sequenced at three time points during the experiment. I have the following samples:

  • Flies fed control before stress, during stress, and after stress
  • Flies fed A before stress, during stress, and after stress

5 Biological replicates of each (30 total samples)

I quality checked, trimmed, aligned to reference with STAR, and got gene counts with HTseq count.

We want to know the differentially expressed genes between C and A diets at each time point. I've set up the sample table in R for deseq2 with SampleID, Time, Condition (ie diet), Sample File (ie htseq output). Now I'm ready to run the analysis but am struggling with how. Here is my code:

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                   directory = directory,
                                   design = ~ condition + time)
colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition,
                                  levels = c("C", "A"))
dds <- DESeq(ddsHTSeq)
res <- results(dds, alpha = 0.05)
# save data results and normalized reads to csv
resdata <- merge(,,normalized =TRUE)), by = 'row.names', sort = FALSE)

I know this is not correct to give me the result I want. How can I set it up so that I get three lists of differentially expressed genes? For before stress, then during, then after.

Thanks in advance.

UPDATE: PCA-condition PCA-time Hist-pvalue-grouped

ADD COMMENTlink modified 9 days ago by genomax70k • written 10 days ago by lkw22230
gravatar for swbarnes2
10 days ago by
United States
swbarnes26.0k wrote:

Do as suggested in the vignette

ADD COMMENTlink modified 10 days ago • written 10 days ago by swbarnes26.0k

Thanks for pointing me there - I was able to use the grouping variable tactic. Then I can compare group C_time0 vs group A_time0 and so on and so forth.

The only problem is, when I pull out the results using results(dds, contrast = c("group", "C_time0", "A_time0") all the adjusted p values are either 1 or NA.

ADD REPLYlink written 9 days ago by lkw22230

If that's what your data is, that's what your data is.

ADD REPLYlink written 9 days ago by swbarnes26.0k

Well, my concern is this. There's appears to be some large variation between samples that is not accounted for by the diet (here U vs. AO) or the time point (0, 60, 240). Also the histogram of the p values is not what I expect to see?

I've added links to graphs to the original question.

ADD REPLYlink modified 9 days ago • written 9 days ago by lkw22230

Sure, large amounts of noise will make it hard for the software to conclude that the difference in means between groups is significant. Your images do not work for me.

ADD REPLYlink written 9 days ago by swbarnes26.0k

The p-value it is supposed to rank between 0 - 0.05 (I push till 0.5). The PCA is too disperse which is ok for the Y axes but not on the X axes which show too much variance between samples. I don't know about the experiment but maybe you could group the samples differently ?

ADD REPLYlink written 9 days ago by Morris_Chair120

Does group associate with PC3 or PC4? Because group looks like it's not affecting expression much, if at all, so comparing between groups will show you little or nothing. Does PC1 correlate with batch?

ADD REPLYlink written 8 days ago by swbarnes26.0k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1997 users visited in the last hour