Setting Up Model in DeSeq2
1
0
Entering edit mode
4.8 years ago
lkw222 ▴ 30

I usually stick to genome seq data, so this is my first time using deseq2. I have RNAseq data from Drosophila that were fed either a control diet (C) or supplemented diet (A) while under going physiological stress. They were sequenced at three time points during the experiment. I have the following samples:

  • Flies fed control before stress, during stress, and after stress
  • Flies fed A before stress, during stress, and after stress

5 Biological replicates of each (30 total samples)

I quality checked, trimmed, aligned to reference with STAR, and got gene counts with HTseq count.

We want to know the differentially expressed genes between C and A diets at each time point. I've set up the sample table in R for deseq2 with SampleID, Time, Condition (ie diet), Sample File (ie htseq output). Now I'm ready to run the analysis but am struggling with how. Here is my code:

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                   directory = directory,
                                   design = ~ condition + time)
colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition,
                                  levels = c("C", "A"))
dds <- DESeq(ddsHTSeq)
res <- results(dds, alpha = 0.05)
# save data results and normalized reads to csv
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds,normalized =TRUE)), by = 'row.names', sort = FALSE)

I know this is not correct to give me the result I want. How can I set it up so that I get three lists of differentially expressed genes? For before stress, then during, then after.

Thanks in advance.

UPDATE: PCA-condition PCA-time Hist-pvalue-grouped

deseq2 rna-seq differential expression htseqcount • 1.9k views
ADD COMMENT
1
Entering edit mode
ADD COMMENT
0
Entering edit mode

Thanks for pointing me there - I was able to use the grouping variable tactic. Then I can compare group C_time0 vs group A_time0 and so on and so forth.

The only problem is, when I pull out the results using results(dds, contrast = c("group", "C_time0", "A_time0") all the adjusted p values are either 1 or NA.

ADD REPLY
0
Entering edit mode

If that's what your data is, that's what your data is.

ADD REPLY
0
Entering edit mode

Well, my concern is this. There's appears to be some large variation between samples that is not accounted for by the diet (here U vs. AO) or the time point (0, 60, 240). Also the histogram of the p values is not what I expect to see?

I've added links to graphs to the original question.

ADD REPLY
0
Entering edit mode

Sure, large amounts of noise will make it hard for the software to conclude that the difference in means between groups is significant. Your images do not work for me.

ADD REPLY
0
Entering edit mode

The p-value it is supposed to rank between 0 - 0.05 (I push till 0.5). The PCA is too disperse which is ok for the Y axes but not on the X axes which show too much variance between samples. I don't know about the experiment but maybe you could group the samples differently ?

ADD REPLY
0
Entering edit mode

Does group associate with PC3 or PC4? Because group looks like it's not affecting expression much, if at all, so comparing between groups will show you little or nothing. Does PC1 correlate with batch?

ADD REPLY

Login before adding your answer.

Traffic: 1950 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6