Question: Understanding DESeq2 padj and Volcano Plot
0
gravatar for juan.crescente
10 weeks ago by
juan.crescente20 wrote:

I'm using DESeq2 for a differential expression analysis of sRNA data. The R code looks like this:

library("DESeq2")
alpha <- 0.05
setwd(dirname(rstudioapi::getActiveDocumentContext()$path));
getwd()
data_path <- "/home/juan/Desktop/juan/bio/mirna_mrcv/data/counts.valid.csv"
result_path <- "/home/juan/Desktop/juan/bio/mirna_mrcv/data/"
# Load data
countdata <- read.table(data_path,header=TRUE,sep="\t")
result_file = paste(result_path,"mirna.deg.csv",sep="");
#DROP unique miRNA clusters
# I don't know why I have to do this first, otherwise RStudio hangs
countdata <- countdata[!is.na(countdata$Name),]
row.names(countdata) <- countdata$Name
countdata <- subset(countdata, select = -c(main))
countdata <- subset(countdata, select = -c(Name))
countdata <- subset(countdata, select = -c(Locus))
countdata <- as.matrix(countdata)
head(countdata)
# Assign condition (first four are controls, second four contain the expansion)
(condition <- factor(c("control","control","treatment","treatment")))
# Analysis with DESeq2 ----------------------------------------------------
# Create a coldata frame and instantiate the DESeqDataSet. See ?DESeqDataSetFromMatrix
(coldata <- data.frame(row.names=colnames(countdata), condition))
levels(coldata$condition)
coldata
dds_m <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~condition)
# Run the DESeq pipeline
dds <- DESeq(dds_m)

res <- results(dds, alpha=alpha)
res <- res[!is.na(res$padj), ]
res <- res[res$padj <= alpha, ]
table(res$padj <= alpha)
## Order by adjusted p-value
res <- res[order(res$padj), ]
## Merge with normalized count data
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds, normalized=TRUE)), by="row.names", sort=FALSE)
resdata
names(resdata)[1] <- "Name"
head(resdata)
## Write results
write.csv(resdata, file=result_file, row.names=FALSE)
result_file

For this, I'm filtering using padj < 0.05, which gives me 11 results.

When I create the volcano plot using EnhancedVolcano, I use padj in y axis and filter by 0.05 Which is supposed to filter with pCutOff ("In this case, the cutoff for the P value then relates to the adjusted P value")

library(EnhancedVolcano)
dds <- DESeq(dds_m, betaPrior=FALSE)
res1 <- results(dds)
res1
EnhancedVolcano(res1,
                lab = rownames(res1),
                x = 'log2FoldChange',
                y = 'padj',
                pCutoff = 0.05)

But I'm not undestanding where my 11 elements should be, I guess from the horizontal line but how is DESeq2 filtering by log 2 fold change, how is it selecting where to cut?

enter image description here

ADD COMMENTlink modified 10 weeks ago • written 10 weeks ago by juan.crescente20

Hey, I developed EnhancedVolcano. Yes, the other genes likely have absolute log [base 2] fold changes < 1. You can check in your res1 object

ADD REPLYlink written 10 weeks ago by Kevin Blighe51k

hey thanks for answering! Should I select all values below padj<0.05 no matter the log2foldchange? Because DESeq2 is selecting all below that and it's inconsisten with the volcano plot, which seems to suggest both variables

ADD REPLYlink written 10 weeks ago by juan.crescente20

You could set the following:

FCcutoff = 0.0

...however, I am not sure how that would look.

ADD REPLYlink written 10 weeks ago by Kevin Blighe51k

Can you point me what is the correct way to do the analysis with DESeq2? Should I use a FC cutoff prior to doing the volcano plot in DESeq2?

ADD REPLYlink written 10 weeks ago by juan.crescente20

Typically, we use FDR Q<0.05 (adjusted P<0.05) __AND__ absolute log2FC>2 as cutoffs in a differential expression study. Volcano plots are then usually generated using the nominal (un-adjusted) p-values. In your experiment, however, you have nothing that has an absolute log2FC>2 ... ...

In your plot, also, if you plot the adjusted p-values, then you need to change the value of ylab to:

ylab = bquote(~-Log[10]~italic(Q))

Also, in your DESeq2 code, if you use betaPrior = FALSE, then you should also make use of the lfcshrink() function; alternatively, just use:

For example, take a look at my 2 previous answers:

ADD REPLYlink modified 10 weeks ago by h.mon28k • written 10 weeks ago by Kevin Blighe51k

very helpful! thank you

ADD REPLYlink written 10 weeks ago by juan.crescente20
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1430 users visited in the last hour