Question

clusterProfiler message "No gene can be mapped"

2

6.4 years ago

ARich ▴ 130

Hi Biostar users,

I am working with the clusterProfiler enrichKEGG function:

KEGG_all = enrichKEGG(regulated.gene$entrez, organism="human")

I used library(org.Hs.eg.db) to convert gene names (symbols) to Entrez IDs, which is the input this function accepts.

However, I am seeing a strange message and I am not sure what the reason could be:

--> No gene can be mapped....
--> Expected input gene ID: 7364,127,574537,5538,4351,221
--> return NULL...

Does this mean the IDs I am providing have no pathways associated? If so, that is not correct, because the Entrez IDs I am using do have pathways associated, except for a few that could not be converted and are listed as NA.

names=c("57801","2152","54873","7412","148867","1435","90874","NA","2702","NA","1520,","3371","7185","26468","286336","NA","22829")

Because of this, I don't understand why this message is displayed.

Looking forward to some help.

R • 40k views

ADD COMMENT • link updated 14 months ago by 13554221497 • 0 • written 6.4 years ago by ARich ▴ 130

2

Are you sure those Entrez IDs are for human genes? I checked a few and none seemed to be human proteins.

ADD REPLY • link 6.4 years ago by GenoMax 144k

1

I am quite sure. The reason is that I can't find these IDs anywhere in my file.

--> No gene can be mapped....
--> Expected input gene ID: 7364,127,574537,5538,4351,221
--> return NULL...

I used the standard conversion to symbols and Entrez IDs:

res$symbol = mapIds(org.Hs.eg.db, keys=row.names(res), column="SYMBOL", keytype="ENSEMBL", multiVals="first")

res$entrez = mapIds(org.Hs.eg.db, keys=row.names(res), column="ENTREZID", keytype="ENSEMBL", multiVals="first")

res$name = mapIds(org.Hs.eg.db, keys=row.names(res), column="GENENAME", keytype="ENSEMBL", multiVals="first")

Which ones did you check? The ones listed in the "No gene can be mapped" message?

Thanks

ADD REPLY • link 6.3 years ago by ARich ▴ 130

0

https://www.ncbi.nlm.nih.gov/protein/7364
https://www.ncbi.nlm.nih.gov/protein/5538

Other IDs in the list above seem to be nucleotide IDs.

ADD REPLY • link 6.3 years ago by GenoMax 144k

0

The IDs in the "Expected input gene ID" line come from the database, not from your input gene list. They are valid human Entrez gene IDs:

> gene <- c("7364", "127", "574537", "5538", "4351", "221")
> bitr(gene, "ENTREZID", "SYMBOL", OrgDb = org.Hs.eg.db)
'select()' returned 1:1 mapping between keys and columns
  ENTREZID  SYMBOL
1     7364  UGT2B7
2      127    ADH4
3   574537  UGT2A2
4     5538    PPT1
5     4351     MPI
6      221 ALDH3B1

ADD REPLY • link 4.4 years ago by wm ▴ 560

0

Hello,

I am also using the same function, enrichKEGG. Can you please explain how the problem was resolved for this error?

> kegg_enrich <- enrichKEGG(gene = names(log_counts), organism = 'uma')
--> No gene can be mapped....
--> Expected input gene ID: UMAG_02115,UMAG_00118,UMAG_03692,UMAG_11744,UMAG_02508,UMAG_06105
--> return NULL

ADD REPLY • link updated 5.9 years ago by GenoMax 144k • written 5.9 years ago by sbb ▴ 10

0

I'm also getting the same error when trying to use the enrichMKEGG function. The same list of Entrez IDs works for enrichKEGG, but not for enrichMKEGG. I've tried changing the minGSSize and maxGSSize limits, but that did not help.

> head(geneset_d1$ENTREZID)
[1] "12842"  "13032"  "20715"  "17329"  "12309"  "140474"


> oraKEGG1 <- enrichKEGG(gene = geneset_d1$ENTREZID, 
+                         organism = "mmu",
+                         pvalueCutoff = 0.05,
+                         qvalueCutoff = 0.25,
+                         pAdjustMethod = "BH",
+                         universe = NULL,
+                         minGSSize = 10,
+                         maxGSSize = 500)
> dim(oraKEGG1)
[1] 19  9



> ora_mKEGG1 <- enrichMKEGG(gene = geneset_d1$ENTREZID, 
+                         organism = "mmu",
+                         pvalueCutoff = 0.05,
+                         qvalueCutoff = 0.25,
+                         pAdjustMethod = "BH",
+                         universe = NULL,
+                         minGSSize = 1,
+                         maxGSSize = 2000)
--> No gene can be mapped....
--> Expected input gene ID: 14433,67834,14751,14380,230163,17448
--> return NULL...

Can anyone help with this?

ADD REPLY • link 4.4 years ago by elizabeth.jennings • 0

0

Because none of these genes is included in the KEGG Module database. You can check the gene list using bitr_kegg:

> gene <- c("12842", "13032", "20715", "17329", "12309", "140474")
> bitr_kegg(gene, fromType = "kegg", toType = "Module", organism = "mmu")
Reading KEGG annotation online:

trying URL 'http://rest.kegg.jp/link/mmu/module'
downloaded 29 KB

Reading KEGG annotation online:

trying URL 'http://rest.kegg.jp/list/module'
downloaded 24 KB

[1] kegg   Module
<0 rows> (or 0-length row.names)
Warning message:
In bitr_kegg(gene, fromType = "kegg", toType = "Module", organism = "mmu") :
  100% of input gene IDs are fail to map...

ADD REPLY • link 4.4 years ago by wm ▴ 560

0

Thanks for your reply! I checked my gene lists, and for all of them 100% of the input genes failed to map. Would this be expected, or is there potentially a problem with the gene IDs that I have? For example, one gene list has 166 genes. How likely is it that none of these is in the KEGG Module database? Many thanks!

ADD REPLY • link 4.4 years ago by elizabeth.jennings • 0

0

It could be that none of your 166 genes is in the KEGG Module database (a manually curated database). You can try KEGG Pathway or other databases.

ADD REPLY • link 4.4 years ago by wm ▴ 560

0

I am having the same problem too... The code was previously working fine. Have you guys sorted it out?

ADD REPLY • link 16 months ago by pkwong96 • 0

0

Adding the parameter use_internal_data=TRUE may solve the problem.

KEGG analysis needs to read the KEGG annotation online, so it may fail if your network connection is poor.

ADD REPLY • link 14 months ago by 13554221497 • 0

0

What versions of Bioconductor and clusterProfiler are you using? I had problems until I updated to Bioconductor 3.16 and clusterProfiler 4.6.2; after that, things worked for me.

ADD REPLY • link 16 months ago by james • 0

1

4.4 years ago

wm ▴ 560

"No gene can be mapped...."

This message appears because none of the input genes is included in the database being queried (KEGG Pathway or KEGG Module).

You can check the genes before running the enrichment analysis:

> bitr_kegg(gene, fromType = "kegg", toType = "Path", organism = "hsa")
> bitr_kegg(gene, fromType = "kegg", toType = "Module", organism = "hsa")

Source code of the function that emits the message, DOSE::enricher_internal():

## file: DOSE/R/enricher_internal.R
## line30-44


## query external ID to Term ID
gene <- as.character(unique(gene))
qExtID2TermID <- EXTID2TERMID(gene, USER_DATA)
qTermID <- unlist(qExtID2TermID)
if (is.null(qTermID)) {
    message("--> No gene can be mapped....")

    p2e <- get("PATHID2EXTID", envir=USER_DATA)
    sg <- unlist(p2e[1:10])
    sg <- sample(sg, min(length(sg), 6))
    message("--> Expected input gene ID: ", paste0(sg, collapse=','))

    message("--> return NULL...")
    return(NULL)
}
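The check in the snippet above can be imitated in plain R, without clusterProfiler or a network connection. This is only a rough sketch: `toy_path2gene` and `can_be_mapped()` are made-up names for illustration, and the pathway labels are placeholders rather than real KEGG pathway IDs.

```r
# Toy stand-in for the PATHID2EXTID table (hypothetical pathway labels).
toy_path2gene <- list(
  pathA = c("7364", "127", "574537"),
  pathB = c("5538", "4351", "221")
)

# Hypothetical helper: return the input IDs found in at least one gene set.
can_be_mapped <- function(gene, path2gene) {
  gene <- as.character(unique(gene))
  known <- unique(unlist(path2gene, use.names = FALSE))
  gene[gene %in% known]
}

can_be_mapped(c("7364", "221"), toy_path2gene)       # both IDs are known
can_be_mapped(c("UMAG_02115", "NA"), toy_path2gene)  # nothing maps
```

When the second call returns an empty vector, the input shares no ID with the annotation table, which is the situation in which the real function prints "No gene can be mapped".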

ADD COMMENT • link 4.4 years ago by wm ▴ 560

0

5.6 years ago

wangchangliang0209 • 0

Hi sbbinfo,

I found that the help page for enrichKEGG is misleading. For organisms other than "hsa", the "gene" parameter is not necessarily an Entrez gene ID. For "uma", you should supply a vector of IDs in the "UMAG_02115", "UMAG_00118" format, not a vector of Entrez gene IDs.

Best wishes.

ADD COMMENT • link 5.6 years ago by wangchangliang0209 • 0

0

I know this may be late, but I am trying to do the analysis for zebrafish. As you mentioned, the enrichKEGG function does not work when I use Entrez gene IDs. Where did you find this format? I would like to know the expected input for zebrafish. I just tried "dre_102725537,dre_795613,dre_393541,dre_100000710,dre_325037,dre_558156" and "DRE_102725537,DRE_795613,DRE_393541,DRE_100000710,DRE_325037,DRE_558156". Neither worked. Could someone help me?

Thank you!

ADD REPLY • link 17 months ago by PBC ▴ 10

0

I'm having the same issue here.. The weirdest thing is that my code ran perfectly okay two weeks ago - but now it just does not work anymore.. I wonder if there was an update for the package or anything...

ADD REPLY • link 16 months ago by Ning • 0

0

That's so weird. I got the same issue too and don't know how to solve it, maybe I should change the database...

ADD REPLY • link 16 months ago by yingzeng13 • 0

0

I am having the same issue too. I wonder if you have figured it out?

ADD REPLY • link 16 months ago by pkwong96 • 0

0

I am facing the same problem. It used to work a little while ago for running KEGG enrichment on human Entrez IDs. Now, it does not recognize the input Entrez IDs.

kk <- enrichKEGG(gene = gene, 
                       organism = "hsa", 
                       keyType = "kegg", 
                       pvalueCutoff = 0.05, 
                       pAdjustMethod = "BH", 
                       universe = annotatedData$gene_id, 
                       minGSSize = 10, 
                       maxGSSize = 500, 
                       qvalueCutoff = 0.2, 
                       use_internal_data = FALSE )

The error message:

--> No gene can be mapped....
--> Expected input gene ID:
--> return NULL...

clusterProfiler v4.2.2

R v4.1.2

I also tried running it on all genes (gene = annotatedData$gene_id). Still, no IDs could be mapped.

These functions work as expected, though, using the same Entrez IDs as input:

enrichMKEGG(gene = gene,
                 organism = 'hsa',
                 pvalueCutoff = 1,
                 qvalueCutoff = 1)

enrichWP(gene,
         organism = "Homo sapiens")

Does anyone have any hints?

ADD REPLY • link 16 months ago by joe555 • 0

0

Since this is working with KEGG perhaps they may have stopped providing access to this package.

Would you mind letting the author know by using "contact by email" link on the project page: https://guangchuangyu.github.io/software/clusterProfiler/

ADD REPLY • link 16 months ago by GenoMax 144k

0

You were right. The author kindly replied and I also found this: https://github.com/YuLab-SMU/clusterProfiler/issues/561

The latest GitHub version of clusterProfiler is supposed to work:

remotes::install_github("YuLab-SMU/clusterProfiler")

version 4.7.1.3

The DOSE package has to be updated, too.

This worked for me a couple of days ago but it stopped working again. So this issue may not be completely resolved.

ADD REPLY • link 16 months ago by joe555 • 0

0

15 months ago

coggy • 0

Just FYI, I was having the same trouble with KEGG pathway analysis in clusterProfiler. I updated BiocManager and clusterProfiler (4.2.2 -> 4.6.2), and then it worked. It seems that the older version of clusterProfiler no longer works at this time.
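A quick way to see whether an installed package is older than a given version is sketched below. `needs_update()` is a hypothetical helper written for this thread, not part of clusterProfiler or BiocManager; the 4.6.2 threshold is simply the version reported as working above.

```r
# Hypothetical helper: TRUE if `pkg` is older than `min_version`,
# FALSE if it is recent enough, NA if the package is not installed.
needs_update <- function(pkg, min_version) {
  if (!nzchar(system.file(package = pkg))) return(NA)
  utils::packageVersion(pkg) < min_version
}

needs_update("clusterProfiler", "4.6.2")
```

If this returns TRUE, updating through BiocManager (as described above) would be the next thing to try; whether that is sufficient depends on the state of the KEGG service at the time.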

ADD COMMENT • link 15 months ago by coggy • 0

0

14 months ago

13554221497 • 0

Try use_internal_data=TRUE.

KEGG analysis needs to read the KEGG annotation online, so the analysis may fail if your network connection is poor.

I ran enrichKEGG without use_internal_data=TRUE on Windows and it worked, while it failed on a Linux cluster. When I added use_internal_data=TRUE there, it worked.

ADD COMMENT • link 14 months ago by 13554221497 • 0

score 0 · Accepted Answer · 2018-03-23

0

6.3 years ago

ARich ▴ 130

I found the answer to my problem. I was applying a cutoff, and after the cutoff a few comparisons had no genes left in the down-regulated list. So this was more of a warning message than an error.
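A guard along these lines would catch that case before the enrichment call. It is a minimal sketch in base R: `clean_entrez()` and `run_if_any()` are hypothetical helper names, and the example vector only mimics the kind of entries shown in the question ("NA" strings, a stray trailing comma).

```r
# Hypothetical helper: tidy a character vector of Entrez IDs.
clean_entrez <- function(ids) {
  ids <- trimws(as.character(ids))
  ids <- sub(",+$", "", ids)                          # drop stray trailing commas
  ids <- ids[!is.na(ids) & ids != "NA" & ids != ""]   # drop missing values
  unique(ids)
}

# Hypothetical wrapper: only call the enrichment function when genes remain.
run_if_any <- function(ids, enrich_fun, ...) {
  ids <- clean_entrez(ids)
  if (length(ids) == 0) {
    message("Gene list is empty after filtering; skipping enrichment.")
    return(NULL)
  }
  enrich_fun(ids, ...)
}

down <- c("57801", "NA", "1520,", NA)
clean_entrez(down)
run_if_any(character(0), function(g) length(g))  # prints the message, returns NULL
```

With clusterProfiler loaded, something like run_if_any(regulated.gene$entrez, enrichKEGG, organism = "hsa") would presumably be the equivalent call, so an empty down-regulated list is reported explicitly instead of surfacing as "No gene can be mapped".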

ADD COMMENT • link 6.3 years ago by ARich ▴ 130