Volcano plot with FDR values looks weird
0
0
Entering edit mode
2.0 years ago
lluc.cabus ▴ 20

Hi everyone,

I'm doing a differential gene expression between two different groups and I have found that the volcano plot looks very weird. I think that this comes from the FDR transformation of the p-values. I have seen there are 21292 unique p-values but there are only 306 unique FDR values. Can you see what I did wrong to have these results?

Thank you very much!

The code to do the differential gene expression and the figure is as follows:

dgList <- DGEList(counts=FL_count_matrix[,1:6], genes=rownames(FL_count_matrix), group = as.factor(FL_metadata_EDTA$Spin_type))
dgList <- calcNormFactors(dgList, method="TMM")
dgList <- estimateCommonDisp(dgList, verbose = T)
dgList <- estimateGLMTrendedDisp(dgList, verbose = T)
dgList <- estimateGLMTrendedDisp(dgList, verbose = T)
dgList <- estimateGLMTagwiseDisp(dgList)
fit <- glmQLFit(dgList)
qlf <- glmQLFTest(fit, coef=2)
FDR= p.adjust(qlf$table$PValue, method="fdr")
qlf$table$FDR= FDR
qlf$table$minus_log10FDR= -log10(FDR)
p <- ggplot(data=qlf$table, aes(x=logFC, y=minus_log10FDR)) + geom_point()+
  geom_hline(yintercept=1, linetype="dashed", color = "red")+ geom_vline(xintercept=c(-1,1), linetype="dashed", color = "red")
p

enter image description here

EdgeR expression gene plot volcano differential R • 862 views
ADD COMMENT
0
Entering edit mode

You simply do not seem to have any DEGs, and the FDR correction pinned all the pvalues to the same (or similar) FDR which is not unusual. Use QC strategies such as PCA or MDS (see edgeR vignette) to diagnose if confounding factors are responsible for this. Also check MA-plots. You have large fold changes in your data, so explore why these are not significant. Maybe batch effects? How many samples is it per group? 3vs3?

ADD REPLY
0
Entering edit mode

Okay, I also have another comparison where I have multiple differentially expressed genes but I also have something similar, this is also normal? I add the volcano plot below. I have used exactly the same code as above, but with different samples.

enter image description here

ADD REPLY
0
Entering edit mode

Sorry, I didn't see the update on the comment. I used 3vs3 here. Batch effect is not an option, since they were processed and sequenced together. In the PCA I see that they form clusters, so it is normal that they have high logFC. However, I don't understand why there are not genes with low FDR values. Could it be because the size of the data is very small and the number of genes is very high? I didn't filter any of the genes, so there are 60308 genes

ADD REPLY

Login before adding your answer.

Traffic: 2223 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6