Question: Understanding DESeq2 padj and Volcano Plot
0
gravatar for juan.crescente
7 months ago by
juan.crescente40 wrote:

I'm using DESeq2 for a differential expression analysis of sRNA data. The R code looks like this:

library("DESeq2")
alpha <- 0.05
setwd(dirname(rstudioapi::getActiveDocumentContext()$path));
getwd()
data_path <- "/home/juan/Desktop/juan/bio/mirna_mrcv/data/counts.valid.csv"
result_path <- "/home/juan/Desktop/juan/bio/mirna_mrcv/data/"
# Load data
countdata <- read.table(data_path,header=TRUE,sep="\t")
result_file = paste(result_path,"mirna.deg.csv",sep="");
#DROP unique miRNA clusters
# I don't know why I have to do this first, otherwise RStudio hangs
countdata <- countdata[!is.na(countdata$Name),]
row.names(countdata) <- countdata$Name
countdata <- subset(countdata, select = -c(main))
countdata <- subset(countdata, select = -c(Name))
countdata <- subset(countdata, select = -c(Locus))
countdata <- as.matrix(countdata)
head(countdata)
# Assign condition (first four are controls, second four contain the expansion)
(condition <- factor(c("control","control","treatment","treatment")))
# Analysis with DESeq2 ----------------------------------------------------
# Create a coldata frame and instantiate the DESeqDataSet. See ?DESeqDataSetFromMatrix
(coldata <- data.frame(row.names=colnames(countdata), condition))
levels(coldata$condition)
coldata
dds_m <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~condition)
# Run the DESeq pipeline
dds <- DESeq(dds_m)

res <- results(dds, alpha=alpha)
res <- res[!is.na(res$padj), ]
res <- res[res$padj <= alpha, ]
table(res$padj <= alpha)
## Order by adjusted p-value
res <- res[order(res$padj), ]
## Merge with normalized count data
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds, normalized=TRUE)), by="row.names", sort=FALSE)
resdata
names(resdata)[1] <- "Name"
head(resdata)
## Write results
write.csv(resdata, file=result_file, row.names=FALSE)
result_file

For this, I'm filtering using padj < 0.05, which gives me 11 results.

When I create the volcano plot using EnhancedVolcano, I use padj in y axis and filter by 0.05 Which is supposed to filter with pCutOff ("In this case, the cutoff for the P value then relates to the adjusted P value")

library(EnhancedVolcano)
dds <- DESeq(dds_m, betaPrior=FALSE)
res1 <- results(dds)
res1
EnhancedVolcano(res1,
                lab = rownames(res1),
                x = 'log2FoldChange',
                y = 'padj',
                pCutoff = 0.05)

But I'm not undestanding where my 11 elements should be, I guess from the horizontal line but how is DESeq2 filtering by log 2 fold change, how is it selecting where to cut?

enter image description here

ADD COMMENTlink modified 7 months ago • written 7 months ago by juan.crescente40

Hey, I developed EnhancedVolcano. Yes, the other genes likely have absolute log [base 2] fold changes < 1. You can check in your res1 object

ADD REPLYlink written 7 months ago by Kevin Blighe56k

hey thanks for answering! Should I select all values below padj<0.05 no matter the log2foldchange? Because DESeq2 is selecting all below that and it's inconsisten with the volcano plot, which seems to suggest both variables

ADD REPLYlink written 7 months ago by juan.crescente40

You could set the following:

FCcutoff = 0.0

...however, I am not sure how that would look.

ADD REPLYlink written 7 months ago by Kevin Blighe56k

Can you point me what is the correct way to do the analysis with DESeq2? Should I use a FC cutoff prior to doing the volcano plot in DESeq2?

ADD REPLYlink written 7 months ago by juan.crescente40

Typically, we use FDR Q<0.05 (adjusted P<0.05) __AND__ absolute log2FC>2 as cutoffs in a differential expression study. Volcano plots are then usually generated using the nominal (un-adjusted) p-values. In your experiment, however, you have nothing that has an absolute log2FC>2 ... ...

In your plot, also, if you plot the adjusted p-values, then you need to change the value of ylab to:

ylab = bquote(~-Log[10]~italic(Q))

Also, in your DESeq2 code, if you use betaPrior = FALSE, then you should also make use of the lfcshrink() function; alternatively, just use:

For example, take a look at my 2 previous answers:

ADD REPLYlink modified 7 months ago by h.mon29k • written 7 months ago by Kevin Blighe56k

very helpful! thank you

ADD REPLYlink written 7 months ago by juan.crescente40
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1469 users visited in the last hour