Question: Showing FDR and log2FC in Volcano plot using R
1
gravatar for Farbod
4 days ago by
Farbod3.0k
Toronto
Farbod3.0k wrote:

Dear Biostars, Hi

In order to bioinformaticly show DEGs of RNA-seq in a volcano plot, I intend to use R, showing DEGs with blue spots and not significantly expressed genes with red dots.

The edgeR package which is embedded in Trinity software, uses FDR (>0.001) and log2FC (>2) of the matrix file as the threshold.

I will show my R code and the head of my matrix file. My QUESTIONs are:

1- Which I should use, FDR or -log10(FDR) ? (e.g in aes(x=logFC, y =-log10(FDR),) (when I used FDR the volcano was funny and when I used -10log(FDR), it seems that the number of blue dots that are presenting the DEGs are more than number of DEGs reported by Trinity package.

2- which I should use; lof FC or log2FC?

NOTE1: The Code

library(ggplot2)
data <- read.table("trans_counts.matrix.A_vs_B.edgeR.DE_results", header=TRUE)

##Identify the genes that have a FDR < 0.001
data$threshold = as.factor(data$FDR < 0.001)

##Construct the plot object
g <- ggplot(data=data, 
            aes(x=logFC, y =-log10(FDR), 
                colour=threshold)) +
  geom_point(alpha=0.4, size=1.75) +
  xlim(c(-14, 14)) +
  xlab("log2 fold change") + ylab("-log10 FDR") +
  theme_bw() +
  theme(legend.position="none")
g

NOTE2: Head of file

> external_gene_name    logFC   logCPM  P.Value FDR
> 
> TRINITY_DN179827_c0_g2_i4 -13.8732718193368   4.61855606743036    3.03811154383156e-30    1.87897780840196e-24
> 
> TRINITY_DN194528_c5_g3_i4 13.2697070220115    4.01618042373378    1.21936893870205e-27    3.7707094407506e-22
> 
> TRINITY_DN192883_c9_g2_i2 11.2269211011449    3.82178000481344    3.20153046019011e-26    6.60015780727773e-21
volcano rna-seq R • 200 views
ADD COMMENTlink modified 4 days ago by cpad01123.6k • written 4 days ago by Farbod3.0k

I think your logFC values are is already log2. In general, volcano plots are logFC vs P-value (-logged (10), in most cases). There are manuscripts that used FDR in volcano plots. Almost similar post here: Volcano plot p-value or FDR?

ADD REPLYlink written 4 days ago by cpad01123.6k

Thanks, In "aes(x=logFC, y =-log10(FDR)" I should use FDR or -log10(FDR) ?

ADD REPLYlink written 4 days ago by Farbod3.0k
1

It makes sense (to me) to use -log10 (FDR) instead of FDR. Talk to your PI or client or the intended journal, once you are done with it and iterate it as per the feedback. Please label the axis accordingly.

ADD REPLYlink modified 4 days ago • written 4 days ago by cpad01123.6k
1

Yes, I have seen both being used, i.e., -log10(P) and -log10(FDR). Either is legitimate.

ADD REPLYlink written 4 days ago by Kevin Blighe9.0k

For question 1, I think the problem is you set the threshold on the original FDR values (data$FDR < 0.001), but then apply it to transformed values (y =-log10(FDR))

ADD REPLYlink written 4 days ago by h.mon9.6k

Yes,

In original research I used FDR = 0.001 and log2FC =2 and I want to draw same volcano.

but

when I set the threshold as "data$threshold = as.factor(data$-log10(FDR) < 0.001) " it says :"Error in eval(expr, envir, enclos) : object 'threshold' not found"

when I use "data$threshold = as.factor(data$P.Value < 0.001)" and " aes(x=logFC, y =-log10(FDR), ", the volcano is appeared.

you can also see that in Trinity, volcano is according to -log10FDR.

What do you think? what should I do ?

ADD REPLYlink modified 4 days ago • written 4 days ago by Farbod3.0k
data$threshold = as.factor(-log10(data$FDR) < 3)

Try in these lines.

ADD REPLYlink written 4 days ago by cpad01123.6k

Hi, I tried it,

The blue spots change to red and red dot changed to blue!

ADD REPLYlink written 4 days ago by Farbod3.0k
1

my bad. It should be >, not <.

data$threshold = as.factor(data$FDR < 0.001)
data$threshold = as.factor(-log10(data$FDR) > 3)

they must be same. Looking back, this should not make any difference. Your original code should still be good as threshold (T/F) is on a list of genes.

ADD REPLYlink modified 4 days ago • written 4 days ago by cpad01123.6k

Hi @cpad0112, I have another question and SORRY if it is not a bioinformatic one,

Is there any way to change these red and blue colours of dots (e.g into red and green)?

as I could not guess which part of my R script is resbomsible for that by looking at it's line of command.

Thanks

ADD REPLYlink written 14 hours ago by Farbod3.0k
1

Try adding following line ( as a separate line)

scale_color_manual(values = c("green", "red")) +
ADD REPLYlink written 12 hours ago by cpad01123.6k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1497 users visited in the last hour