Question

Setting Up Model in DeSeq2

0

Entering edit mode

4.8 years ago

lkw222 ▴ 30

I usually stick to genome seq data, so this is my first time using deseq2. I have RNAseq data from Drosophila that were fed either a control diet (C) or supplemented diet (A) while under going physiological stress. They were sequenced at three time points during the experiment. I have the following samples:

Flies fed control before stress, during stress, and after stress
Flies fed A before stress, during stress, and after stress

5 Biological replicates of each (30 total samples)

I quality checked, trimmed, aligned to reference with STAR, and got gene counts with HTseq count.

We want to know the differentially expressed genes between C and A diets at each time point. I've set up the sample table in R for deseq2 with SampleID, Time, Condition (ie diet), Sample File (ie htseq output). Now I'm ready to run the analysis but am struggling with how. Here is my code:

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                   directory = directory,
                                   design = ~ condition + time)
colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition,
                                  levels = c("C", "A"))
dds <- DESeq(ddsHTSeq)
res <- results(dds, alpha = 0.05)
# save data results and normalized reads to csv
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds,normalized =TRUE)), by = 'row.names', sort = FALSE)

I know this is not correct to give me the result I want. How can I set it up so that I get three lists of differentially expressed genes? For before stress, then during, then after.

Thanks in advance.

UPDATE:

deseq2 rna-seq differential expression htseqcount • 1.9k views

ADD COMMENT • link updated 4.8 years ago by GenoMax 141k • written 4.8 years ago by lkw222 ▴ 30

score 1 · Accepted Answer · 2019-07-11

1

Entering edit mode

4.8 years ago

swbarnes2 14k

Do as suggested in the vignette

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions

ADD COMMENT • link 4.8 years ago by swbarnes2 14k

0

Entering edit mode

Thanks for pointing me there - I was able to use the grouping variable tactic. Then I can compare group C_time0 vs group A_time0 and so on and so forth.

The only problem is, when I pull out the results using results(dds, contrast = c("group", "C_time0", "A_time0") all the adjusted p values are either 1 or NA.

ADD REPLY • link 4.8 years ago by lkw222 ▴ 30

0

Entering edit mode

If that's what your data is, that's what your data is.

ADD REPLY • link 4.8 years ago by swbarnes2 14k

0

Entering edit mode

Well, my concern is this. There's appears to be some large variation between samples that is not accounted for by the diet (here U vs. AO) or the time point (0, 60, 240). Also the histogram of the p values is not what I expect to see?

I've added links to graphs to the original question.

ADD REPLY • link 4.8 years ago by lkw222 ▴ 30

0

Entering edit mode

Sure, large amounts of noise will make it hard for the software to conclude that the difference in means between groups is significant. Your images do not work for me.

ADD REPLY • link 4.8 years ago by swbarnes2 14k

0

Entering edit mode

The p-value it is supposed to rank between 0 - 0.05 (I push till 0.5). The PCA is too disperse which is ok for the Y axes but not on the X axes which show too much variance between samples. I don't know about the experiment but maybe you could group the samples differently ?

ADD REPLY • link 4.8 years ago by Morris_Chair ▴ 350

0

Entering edit mode

Does group associate with PC3 or PC4? Because group looks like it's not affecting expression much, if at all, so comparing between groups will show you little or nothing. Does PC1 correlate with batch?

ADD REPLY • link 4.8 years ago by swbarnes2 14k