Performing Differential expression analysis using DESeq with 3 conditions
2
0
Entering edit mode
3.6 years ago
bionewbie • 0

I have Gene Id as ID column & Control(WT), 2 mutated genes (MA and MD) as Conditions. Each condition has 3 columns of data (WT -3, MA-3, MD-3)

My goal is to compare MA vs WT and MD vs WT - perform differential expression analysis and pathway enrichment analysis.

counts - counts data;
sample_data - metadata; 
condition - WT, MA, MD

dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = sample_data,
                              design = ~ condition)

dds = DESeq(dds)  # main function

resultsNames(dds)

However when I enter the following command it says - "Intercept" "cell_line_MD_vs_MA" "cell_line_WT_vs_MA" But I want the comparison to be about MA vs WT and MD vs WT. I am able to use the contrast option to compare the conditions but when i try to use lfcShrink(dds, coef= "cell_line_MA_vs_WT", type="apeglm") it does not work. Please let me know how I can apply lfc in this case when the dds results are by default - "cell_line_MD_vs_MA" "cell_line_WT_vs_MA".

RNA-Seq Deseq2 R next-gen bioconductor • 1.5k views
ADD COMMENT
2
Entering edit mode
3.6 years ago
ATpoint 84k

It is contrasts what you need, please check the manual for that.

ADD COMMENT
0
Entering edit mode

Thanks for your previous answer but I have updated the question and it would be great if you could help me by answering it. My dds results are by default - "cell_line_MD_vs_MA" "cell_line_WT_vs_MA" but I am able to apply contrasts to do a comparison across conditions but when I try to apply logfcshrinkage it does not work because the conditions I am looking for are - "cell_line_MA_vs_WT", "cell_line_MD_vs_WT".

ADD REPLY
1
Entering edit mode
sample_data$condition <- relevel(as.factor(sample_data$condition), "WT")

DESeq2 will use the first factor level as the default level for the regression, so if you change the first factor level to WT it should fix your problem.

ADD REPLY
1
2
Entering edit mode
3.6 years ago

If you want to use names you can reorder the factor level of condition too.

sample_data$condition <- relevel(as.factor(sample_data$condition), "WT")
ADD COMMENT
0
Entering edit mode

Thank you! It worked. I used the following commands in this sequence.

res <- lfcShrink(dds, coef= "cell_line_MD_vs_WT", type="apeglm")

res <- results(dds, alpha = 0.5, contrast=c("cell_line","MD","WT"))

resSig = as.data.frame(subset(res,padj<0.5) )

resSig = resSig[order(resSig$log2FoldChange,decreasing=TRUE),]

And then I am using GOStats to select the upregulated genes:

selectedGenes = unique(resSig[resSig$log2FoldChange>1,'entrez'])  # upregulated genes

universeGenes =  unique( mapIds(org.Hs.eg.db,
                     keys= res$ensembl,
                     column="ENTREZID",
                     keytype="ENSEMBL", #Out ID is ENSMBL
                     multiVals="first")
                    )



hgCutoff <- 0.001
 params <- new("GOHyperGParams",
     geneIds=selectedGenes,
     universeGeneIds=universeGenes,
     annotation="org.Hs.eg.db",
     ontology="BP",
     pvalueCutoff=hgCutoff,
     conditional=FALSE,
     testDirection="over")


hgOver <- hyperGTest(params)

It would be great if you could let me know how much should be the range for resSig$log2FoldChange> x for upregulated and downregulated genes.

ADD REPLY
1
Entering edit mode

Your adjust p-value threshold (alpha) should be 0.05, not 0.5.

You can also set the lfcThreshold argument in the lfcShrink function instead of post-hoc filtering of FCs. The most common threshold value is usually usually 1.

ADD REPLY
0
Entering edit mode

Got it. So in my selectedGenes variable

selectedGenes = unique(resSig[resSig$log2FoldChange>1,'entrez']) # upregulated genes

And,

selectedGenes = unique(resSig[resSig$log2FoldChange<1,'entrez']) # downregulated genes

Is this correct?

ADD REPLY
1
Entering edit mode

In your code you have

res <- lfcShrink(dds, coef= "cell_line_MD_vs_WT", type="apeglm")

res <- results(dds, alpha = 0.5, contrast=c("cell_line","MD","WT"))

You are currently overwriting your Log2FC shrinkage results since since you saving the results function output to the same varaible. Instead of the above two lines just do:

res <- lfcShrink(dds, coef= "cell_line_MD_vs_WT", type="apeglm", lfcThreshold=1)
ADD REPLY
0
Entering edit mode

Thanks for your response. I am not getting many values if I keep a threshold of 1.

So I just used res <- lfcShrink(dds, coef= "cell_line_MD_vs_WT", type="apeglm")

summary(res)

adjusted p-value < 0.1

LFC > 0 (up) : 3055, 14%

LFC < 0 (down) : 3301, 15%

Does this mean LFC above zero is upregulated and LFC below zero is downregulated? If this statement is true then will the following lines of code get me the upregulated genes?

resSig = resSig[order(resSig$log2FoldChange,decreasing=TRUE),]

And then I am using GOStats to select the upregulated genes:

selectedGenes = unique(resSig[resSig$log2FoldChange>0,'entrez']) # upregulated genes

universeGenes = unique( mapIds(org.Hs.eg.db, keys= res$ensembl, column="ENTREZID", keytype="ENSEMBL", #Out ID is ENSMBL multiVals="first") )

hgCutoff <- 0.001 params <- new("GOHyperGParams", geneIds=selectedGenes, universeGeneIds=universeGenes, annotation="org.Hs.eg.db", ontology="BP", pvalueCutoff=hgCutoff, conditional=FALSE, testDirection="over")

hgOver <- hyperGTest(params)

ADD REPLY

Login before adding your answer.

Traffic: 1061 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6