Correct way to make multiple comparisons on DESeq2?
1
0
Entering edit mode
10 months ago
jbnrodriguez ▴ 30

I have a project where I have done RNA-seq (paired-end sequencing on Illumina HiSeq) of a worm at different days of development i.e. Ages 0-12. For each age, I have sequenced 3 replicate specimens. I'm new to DESeq2 and I was wondering if what I did below is correct.

library(DESeq2)
count_file_names <- grep("counts",list.files("featureCounts"),value=T)
specimen_type <- c("Age0","Age2","Age4","Age6","Age8","Age10","Age12")
sample_information <- data.frame(sampleName = count_file_names, fileName = count_file_names, condition = specimen_type)


Note- I switched to featureCounts instead of HTSeq for generating the counts but I formatted the featureCounts output files to reflect the HTSeq count output format. As far as I know, DESeq2 needs just counts and doesn't discriminate between the two count generating tools

DESeq_data <- DESeqDataSetFromHTSeqCount(sampleTable = sample_information, directory = "featureCounts", design = ~condition)
colData(DESeq_data)$condition <- factor(colData(DESeq_data)$condition,levels = c('Age0','Age2','Age4','Age6','Age8','Age10','Age12'))
rld <- rlogTransformation(DESeq_data, blind=T)


Note - I did relevel below based on Age0 even though it doesn't represent my control samples or anything. I wonder if this relevel step can be skipped though?

DESeq_data$condition <- relevel(DESeq_data$condition, "Age0")
DESeq_data <- DESeq(DESeq_data)


Below, I hope to know what genes are up-/down-regulated in Day12 worm specimens when compared to the rest. My real question is whether what I did below is correct/sensible? I plan to do in the same way for the other age specimens too.

Age_12_result <- results(DESeq_data, contrast = c(-1/7,-1/7,-1/7,-1/7,-1/7,-1/7,1/7), pAdjustMethod="BH", lfcThreshold=0, independentFiltering=T)
sig_de_results <- subset(Age_12_result, abs(log2FoldChange)> 1 & padj < 0.05)
sig_de_results <- sig_de_results[order(sig_de_results$padj, decreasing=F),]  My hope is that 'sig_de_results' tells me what genes were up-/down-regulated in Day12 specimens versus the average expression of genes in all of the remainder age specimens. Thanks a zillion in advance! expression differential DESeq DESeq2 RNA-seq • 453 views ADD COMMENT 2 Entering edit mode 9 months ago jbnrodriguez ▴ 30 I solved this myself with the below code. ## Compare age 12 versus age 0 with respect to the average of the rest of the comparisons count_file_names <- grep("counts",list.files("featureCounts"),value=T) specimen_type <- c("Age0","Age0","Age0","Age0","Age0","Age10","Age10","Age10","Age10","Age10","Age12","Age12","Age12","Age12","Age12","Age2","Age2","Age2","Age2","Age2","Age4","Age4","Age4","Age4","Age4","Age6","Age6","Age6","Age6","Age6","Age8","Age8","Age8","Age8","Age8") sample_information <-data.frame(sampleName = count_file_names, fileName = count_file_names, condition = specimen_type) DESeq_data <- DESeqDataSetFromHTSeqCount(sampleTable = sample_information, directory = "featureCounts", design = ~condition) dds_lrt <- DESeq(DESeq_data, test="LRT", reduced = ~ 1)  ## To compare one condition versus the average of the rest res12v0All_LRT <- results(dds_lrt, contrast= list (c("condition_Age12_vs_Age0"), c("condition_Age10_vs_Age0","condition_Age2_vs_Age0","condition_Age4_vs_Age0","condition_Age6_vs_Age0","condition_Age8_vs_Age0")), listValues=c(1,-1/7))  ## When I do head(res12v0All_LRT), I get the below ## log2 fold change (MLE): condition_Age12_vs_Age0 vs 0.143 condition_Age10_vs_Age0+condition_Age2_vs_Age0+condition_Age4_vs_Age0+condition_Age6_vs_Age0+condition_Age8_vs_Age0 ## LRT p-value: '~ condition' vs '~ 1' ## DataFrame with 6 rows and 6 columns sig_de_results <- subset(res12v0All_LRT, abs(log2FoldChange)> 1 & padj < 0.05) sig_de_results <- sig_de_results[order(sig_de_results$padj, decreasing=F),]

0
Entering edit mode

Note- I had more than 3 replicates in most of my sample age groups (I had stated the replicate number as being 3 in my initial question)