Question

FDR values in clusterProfiler

3.5 years ago

karlaarz ▴ 90

Hello,

I have the following human gene list:

CCT3 ABCA4 HEG1 PPARGC1B ADD1 CEP85 SLC1A4 DUSP10 PLAGL2 UBE2G2 NTRK2 PPIP5K1 DDB1 PRPH2 OAZ2 PEA15 ICMT KDM4A NCOA6 ZNF609 AKAP1 SYNE3 CAMSAP1 POLE4 ZDHHC5 ANGEL1 KCNJ14 NDUFA8 SIPA1L2 BTD CCT7 ANO2

I ran the enrichment analysis with clusterProfiler:

gobp <- enrichGO(glist, OrgDb = org.Hs.eg.db, ont = "BP", pAdjustMethod = "fdr", keyType = "SYMBOL", pvalueCutoff = 0.05)

head(gobp)
                   ID                                                              Description GeneRatio   BgRatio       pvalue   p.adjust    qvalue                              geneID Count
GO:1903405 GO:1903405                                     protein localization to nuclear body      2/36  10/18670 0.0001611010 0.02351193 0.0197561                           CCT3/CCT7     2
GO:1904851 GO:1904851 positive regulation of establishment of protein localization to telomere      2/36  10/18670 0.0001611010 0.02351193 0.0197561                           CCT3/CCT7     2
GO:1904867 GO:1904867                                       protein localization to Cajal body      2/36  10/18670 0.0001611010 0.02351193 0.0197561                           CCT3/CCT7     2
GO:0060249 GO:0060249                                         anatomical structure homeostasis      6/36 439/18670 0.0001748712 0.02351193 0.0197561 ABCA4/CCT3/CCT7/POLE4/ADD1/PPARGC1B     6
GO:0070203 GO:0070203          regulation of establishment of protein localization to telomere      2/36  11/18670 0.0001966624 0.02351193 0.0197561                           CCT3/CCT7     2
GO:0070202 GO:0070202        regulation of establishment of protein localization to chromosome      2/36  12/18670 0.0002357086 0.02351193 0.0197561                           CCT3/CCT7     2

However, if I run the same analysis with PANTHER or DAVID, I don't get any statistically significant results: all FDR values are equal to 1. I know that clusterProfiler calculates FDR values following the Storey (2002) paper, but I wouldn't expect such a big difference.

1) Why does clusterProfiler show different FDR values from PANTHER and DAVID?

Thanks!

RNA-Seq clusterProfiler • 1.5k views

updated 3.4 years ago by Biostar 20 • written 3.5 years ago by karlaarz ▴ 90

I don't think the tools you listed use the same statistical test. I believe DAVID uses a modified Fisher's exact test, PANTHER uses a binomial test, and clusterProfiler uses the hypergeometric test. This would give different p-values for each tool and, in turn, different FDR-corrected p-values.
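To make the difference between tests concrete, here is a minimal base R sketch that applies three tests to the counts in the first row of the enrichGO output above (2 of 36 query genes versus 10 of 18670 background genes). The variable names are illustrative. The "EASE-like" variant assumes the modification amounts to removing one gene from the overlap before running Fisher's exact test, which may not match DAVID's exact table layout.

```r
# Counts from the first row of the enrichGO output (GO:1903405).
k <- 2      # query genes annotated to the term
n <- 36     # query genes with any annotation (GeneRatio denominator)
M <- 10     # background genes annotated to the term
N <- 18670  # background size (BgRatio denominator)

# One-sided hypergeometric test: P(X >= k).
p_hyper <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# One-sided binomial test with success probability M / N.
p_binom <- binom.test(k, n, p = M / N, alternative = "greater")$p.value

# EASE-like modified Fisher's exact test (ASSUMPTION: the overlap is
# penalised by one gene; DAVID's own table layout may differ).
ease_table <- matrix(
  c(k - 1, n - k,          # in query: in term, not in term
    M - k, N - M - n + k), # not in query: in term, not in term
  nrow = 2
)
p_ease <- fisher.test(ease_table, alternative = "greater")$p.value

print(c(hypergeometric = p_hyper, binomial = p_binom, ease_like = p_ease))
```

In this sketch the hypergeometric value should reproduce the pvalue column above (about 1.61e-4), the binomial value is slightly larger, and the EASE-like value is larger by roughly two orders of magnitude, so the choice of test alone can move a term across a significance cutoff.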

There are other factors to consider too, such as the version of the Gene Ontology database each tool uses, and the "universe" of genes used in the statistical calculations (the total number of genes from the genome that are considered as background).
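As a rough illustration of how the universe and the number of tested terms feed into the adjusted values, the base R sketch below reuses the counts from the first row of the output above. The alternative universe sizes (5000 and 30000) and the numbers of tested terms are hypothetical, chosen only to show the direction of the effect. It also relies on the fact that in base R's p.adjust, "fdr" is an alias for "BH" (Benjamini-Hochberg); that clusterProfiler passes pAdjustMethod on to p.adjust is an assumption here.

```r
k <- 2; n <- 36; M <- 10  # counts from the first row of the output

# Hypergeometric p-value as a function of the universe size N.
enrich_p <- function(N) phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Only "annotated" comes from the post; the other sizes are hypothetical.
universes <- c(smaller = 5000, annotated = 18670, larger = 30000)
p_by_universe <- sapply(universes, enrich_p)
print(p_by_universe)

# BH adjustment of the top p-value, treated as the smallest of m tests.
# The values of m are hypothetical.
p_top <- enrich_p(18670)
n_tests <- c(100, 1000, 10000)
adj <- sapply(n_tests, function(m) p.adjust(p_top, method = "BH", n = m))
print(setNames(adj, n_tests))

# "fdr" and "BH" give identical results in base R.
x <- c(0.001, 0.01, 0.02, 0.5)
same_method <- identical(p.adjust(x, "fdr"), p.adjust(x, "BH"))
print(same_method)
```

A smaller universe inflates the raw p-value, and a larger number of tested terms inflates the adjusted one, up to the cap of 1. Either effect could contribute to one tool reporting FDR = 1 where another reports about 0.02.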

3.5 years ago by rpolicastro 13k