Question

How to select background metabolite sets for differential metabolite enrichment analysis?

7 hours ago

Edward • 0

Hello everyone!

I used the following code to perform differential metabolite enrichment analysis.

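library(clusterProfiler)
library(KEGGREST)
library(dplyr)    # mutate()
library(stringr)  # str_remove()

# Example input: KEGG compound IDs of the differential metabolites
metabolite_list <- c("C00031", "C00022", "C00186", "C00042", "C00149")

# In the real analysis the IDs come from my table of significant compounds:
# metabolite_list <- unique(df_sig_comp$KEGG_COMPOUND_ID)

# Named vector: names are compounds ("cpd:C00022"), values are pathways ("path:map00010")
kegg_compound2pathway <- KEGGREST::keggLink("pathway", "compound")
# Group the compound IDs by pathway
kegg_pathway2compound <- split(names(kegg_compound2pathway),
                               kegg_compound2pathway)
# Two-column TERM2GENE table: pathway (ind) first, compound (values) second
kegg_pathway2compound_stack <- stack(kegg_pathway2compound)[, 2:1] |>
  mutate(values = str_remove(values, "^cpd:"))

enrich_res <- clusterProfiler::enricher(
  gene = metabolite_list,
  TERM2GENE = kegg_pathway2compound_stack,
  pvalueCutoff = 0.05,
  qvalueCutoff = 0.2
)

head(as.data.frame(enrich_res))
dotplot(enrich_res, showCategory = 10)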

The results are as follows:

                        ID   Description GeneRatio BgRatio RichFactor FoldEnrichment   zScore       pvalue     p.adjust       qvalue
path:map04922 path:map04922 path:map04922       5/5 26/6589 0.19230769      253.42308 35.53705 6.365563e-13 5.156106e-11 9.380830e-12
path:map05230 path:map05230 path:map05230       5/5 37/6589 0.13513514      178.08108 29.76480 4.218197e-12 1.708370e-10 3.108145e-11
path:map00620 path:map00620 path:map00620       4/5 32/6589 0.12500000      164.72500 25.58317 2.283698e-09 6.165984e-08 1.121816e-08
path:map02020 path:map02020 path:map02020       4/5 56/6589 0.07142857       94.12857 19.28579 2.325710e-08 4.709563e-07 8.568405e-08
path:map04066 path:map04066 path:map04066       3/5 15/6589 0.20000000      263.56000 28.05280 9.521691e-08 1.542514e-06 2.806393e-07
path:map00020 path:map00020 path:map00020       3/5 20/6589 0.15000000      197.67000 24.27283 2.382935e-07 3.216962e-06 5.852823e-07
                                          geneID Count
path:map04922 C00031/C00022/C00186/C00042/C00149     5
path:map05230 C00031/C00022/C00186/C00042/C00149     5
path:map00620        C00022/C00186/C00042/C00149     4
path:map02020        C00031/C00022/C00042/C00149     4
path:map04066               C00031/C00022/C00186     3
path:map00020               C00022/C00042/C00149     3
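
For context on how BgRatio enters the test: enricher() performs a one-sided hypergeometric test, so the first row can be reproduced in base R. This is only a sketch; the variable names are illustrative, and the numbers are copied from the path:map04922 row (5 of 5 input compounds in a pathway of 26 compounds, with a background of 6589 compounds).

```r
# Over-representation p-value for path:map04922, using the counts from the table
k <- 5     # input compounds found in the pathway (GeneRatio numerator)
n <- 5     # input compounds present in the background (GeneRatio denominator)
M <- 26    # pathway compounds in the background (BgRatio numerator)
N <- 6589  # all compounds in the background (BgRatio denominator)

# P(X >= k) when drawing n compounds without replacement from N
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Ratio of the observed hit fraction to the background fraction
fold_enrichment <- (k / n) / (M / N)

print(p_value)
print(fold_enrichment)
```

With k, n and M held fixed, a smaller background N gives a larger p-value here, which is why the choice of background matters for these results.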

The content of kegg_pathway2compound_stack is as follows:

> head(kegg_pathway2compound_stack)
            ind values
1 path:map00010 C00022
2 path:map00010 C00024
3 path:map00010 C00031
4 path:map00010 C00033
5 path:map00010 C00036
6 path:map00010 C00068

I noticed that KEGGREST::keggLink("pathway", "compound") queries the KEGG REST link API. The raw listing at https://rest.kegg.jp/link/compound/pathway contains the same compound-pathway pairs:

path:map00010   cpd:C00022
path:map00010   cpd:C00024
path:map00010   cpd:C00031
path:map00010   cpd:C00033
path:map00010   cpd:C00036

> head(kegg_compound2pathway)
     cpd:C00022      cpd:C00024      cpd:C00031      cpd:C00033      cpd:C00036      cpd:C00068 
"path:map00010" "path:map00010" "path:map00010" "path:map00010" "path:map00010" "path:map00010"

The genes in the Glycerophospholipid metabolism pathway differ between species. Are the metabolites in KEGG pathways species-specific as well?

For differentially expressed genes, a species-specific background gene set can be obtained from the interface below. How should the background set be selected for metabolites?
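
For reference, enricher() accepts a universe argument that sets the background explicitly. The base R sketch below shows how much the choice can change a p-value. The helper name ora_p_value and all toy data (all_compounds, detected_compounds, pathway, hits) are hypothetical and made up for illustration; whether "detected compounds" is the right background is exactly what I am asking.

```r
# Hypothetical helper: hypergeometric ORA p-value for one pathway,
# restricted to a chosen background (universe) of compound IDs.
ora_p_value <- function(hits, pathway, universe) {
  hits    <- intersect(hits, universe)     # only hits inside the background count
  pathway <- intersect(pathway, universe)  # same for the pathway members
  k <- length(intersect(hits, pathway))    # hits that fall in the pathway
  phyper(k - 1, length(pathway), length(universe) - length(pathway),
         length(hits), lower.tail = FALSE)
}

# Toy data (made up): 1000 annotated compounds, of which 100 were detected
all_compounds      <- sprintf("C%05d", 1:1000)
detected_compounds <- all_compounds[1:100]
pathway            <- all_compounds[1:20]
hits               <- all_compounds[1:5]

# Same hits and pathway, two different backgrounds
p_all      <- ora_p_value(hits, pathway, all_compounds)
p_detected <- ora_p_value(hits, pathway, detected_compounds)
print(c(all = p_all, detected = p_detected))
```

In clusterProfiler the corresponding call would pass the chosen compound IDs as universe = ... to enricher(), alongside the same TERM2GENE table.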

enrichment ORA • 46 views