Question: Convert Ensembl Transcript Ids Ensmust To Gene Symbol In R
3.7 years ago
sugus ▴ 150

I have a list of Ensembl transcript IDs and I would like to convert them to gene symbols. How can I do this in R?

 [1] "ENSMUST00000110582" "ENSMUST00000110585" "ENSMUST00000110586" "ENSMUST00000135559"
 [5] "ENSMUST00000166082" "ENSMUST00000135552" "ENSMUST00000207020" "ENSMUST00000192677"
 [9] "ENSMUST00000090457" "ENSMUST00000207029"

I have tried using the GPL20775-85955 annotation, but only a few of the IDs could be matched:

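library(data.table)  # provides fread()
Ginfo <- fread(file.path(data.path, "GPL20775-85955.txt"), sep = "\t", header = TRUE, check.names = FALSE, stringsAsFactors = FALSE, data.table = FALSE)
# Take the first " // "-separated field of mrna_assignment as the transcript ID
Ginfo$ENSEMBLE <- sapply(strsplit(Ginfo$mrna_assignment, " // ", fixed = TRUE), "[", 1)
Ginfo <- Ginfo[!duplicated(Ginfo$ENSEMBLE), ]
rownames(Ginfo) <- Ginfo$ENSEMBLE
# Take the second field of gene_assignment (assumed to be the gene symbol)
Ginfo$mrna <- sapply(strsplit(Ginfo$gene_assignment, " // ", fixed = TRUE), "[", 2)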

I think there may be an easier way using biomaRt or something similar, but I do not know how.

R ENSEMBL gene • 12k views
3.7 years ago
sugus ▴ 150

I solved this.

library(biomaRt)
mart <- useMart("ensembl", "mmusculus_gene_ensembl")  # for human, choose hsapiens_gene_ensembl
ensemble2gene <- getBM(attributes = c("ensembl_transcript_id", "external_gene_name", "ensembl_gene_id"),
                       filters = "ensembl_transcript_id",
                       values = rownames(countsTable),  # countsTable has transcript IDs as row names
                       mart = mart)
rownames(ensemble2gene) <- ensemble2gene$ensembl_transcript_id
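
One detail that can cause poor matching with this approach: row names taken from a quantification tool may carry a version suffix, while the `ensembl_transcript_id` filter works on unversioned IDs. The following is a minimal sketch, not part of the original answer, of stripping such suffixes and attaching symbols to a counts table with `match()`. The helper name `strip_version`, the toy `counts` matrix (including its made-up version numbers and one made-up unmapped ID), and the small `tx2gene` data frame standing in for the `getBM()` result are all assumptions for illustration.

```r
# Hypothetical helper: drop a trailing ".<digits>" version suffix, if present.
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

# Toy stand-in for a counts table whose row names are versioned transcript IDs.
# The version numbers and the third ID are made up for illustration.
counts <- matrix(
  c(10, 0, 5, 7, 3, 1), nrow = 3,
  dimnames = list(
    c("ENSMUST00000110582.3", "ENSMUST00000110585.2", "ENSMUST99999999999.1"),
    c("sampleA", "sampleB")))

# Toy stand-in for the data frame returned by getBM() (symbols taken from this thread).
tx2gene <- data.frame(
  ensembl_transcript_id = c("ENSMUST00000110582", "ENSMUST00000110585"),
  external_gene_name    = c("Jrkl", "Skp2"),
  stringsAsFactors = FALSE)

# match() returns NA for IDs missing from the annotation, so unmapped rows stay visible.
ids     <- strip_version(rownames(counts))
symbols <- tx2gene$external_gene_name[match(ids, tx2gene$ensembl_transcript_id)]

annotated <- data.frame(transcript = ids, symbol = symbols, counts,
                        row.names = NULL, stringsAsFactors = FALSE)
print(annotated)
```

Using `match()` rather than `merge()` keeps the original row order and row count, which matters if the counts table is passed on to downstream tools.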

Awww, you beat me

3.7 years ago

Hey friend, all good?

You can try org.Mm.eg.db and biomaRt. Let's see which one can map these:

lookup <- c('ENSMUST00000110582','ENSMUST00000110585',
  'ENSMUST00000110586','ENSMUST00000135559',
  'ENSMUST00000166082','ENSMUST00000135552',
  'ENSMUST00000207020','ENSMUST00000192677',
  'ENSMUST00000090457','ENSMUST00000207029')

org.Mm.eg.db

require('org.Mm.eg.db')
keytypes(org.Mm.eg.db)
select(
  org.Mm.eg.db,
  keytype = 'ENSEMBLTRANS',
  columns = c('ENSEMBL','ENSEMBLTRANS','ENTREZID','SYMBOL'),
  keys = lookup)

         ENSEMBLTRANS            ENSEMBL ENTREZID        SYMBOL
1  ENSMUST00000110582 ENSMUSG00000079083    77532          Jrkl
2  ENSMUST00000110585               <NA>     <NA>          <NA>
3  ENSMUST00000110586               <NA>     <NA>          <NA>
4  ENSMUST00000135559               <NA>     <NA>          <NA>
5  ENSMUST00000166082               <NA>     <NA>          <NA>
6  ENSMUST00000135552               <NA>     <NA>          <NA>
7  ENSMUST00000207020               <NA>     <NA>          <NA>
8  ENSMUST00000192677               <NA>     <NA>          <NA>
9  ENSMUST00000090457 ENSMUSG00000022543    74684 4930451G09Rik
10 ENSMUST00000207029               <NA>     <NA>          <NA>

biomaRt

library(biomaRt)
mart <- useMart('ensembl', dataset = 'mmusculus_gene_ensembl')

getBM(
  attributes = c(
    'ensembl_gene_id',
    'ensembl_transcript_id', 
    'entrezgene_id', 'mgi_symbol'),
  filters = 'ensembl_transcript_id',
  values = lookup,
  mart = mart)

      ensembl_gene_id ensembl_transcript_id entrezgene_id    mgi_symbol
1  ENSMUSG00000022543    ENSMUST00000090457         74684 4930451G09Rik
2  ENSMUSG00000079083    ENSMUST00000110582         77532          Jrkl
3  ENSMUSG00000054115    ENSMUST00000110585         27401          Skp2
4  ENSMUSG00000060227    ENSMUST00000110586        319996         Golm2
5  ENSMUSG00000020744    ENSMUST00000135552         67283      Slc25a19
6  ENSMUSG00000038271    ENSMUST00000135559        320678         Iffo1
7  ENSMUSG00000025512    ENSMUST00000166082         68038         Chid1
8  ENSMUSG00000034755    ENSMUST00000192677        245578       Pcdh11x
9  ENSMUSG00000058966    ENSMUST00000207020         68952        Tlcd3b
10 ENSMUSG00000109498    ENSMUST00000207029            NA       Gm45222
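
To put a number on the difference between the two resources, a small helper can list which queried IDs came back without a symbol. This is only a sketch under assumptions: `unmapped_ids` is a hypothetical helper, and the two data frames below are hand-copied subsets of the outputs shown above rather than live query results. Because `getBM()` leaves out IDs it cannot find instead of returning `NA` rows, the check compares against the original query vector.

```r
# Hypothetical helper: which queried IDs have no usable symbol in a result table?
# `id_col` and `symbol_col` name the relevant columns of `result`.
unmapped_ids <- function(query, result, id_col, symbol_col) {
  sym <- result[[symbol_col]]
  ok  <- !is.na(sym) & nzchar(sym)          # treat NA and "" as unmapped
  setdiff(query, result[[id_col]][ok])      # also catches IDs absent from `result`
}

query <- c("ENSMUST00000110582", "ENSMUST00000110585", "ENSMUST00000090457")

# Rows copied from the org.Mm.eg.db output above (select() keeps NA rows).
orgdb_res <- data.frame(
  ENSEMBLTRANS = query,
  SYMBOL       = c("Jrkl", NA, "4930451G09Rik"),
  stringsAsFactors = FALSE)

# Rows copied from the biomaRt output above.
biomart_res <- data.frame(
  ensembl_transcript_id = c("ENSMUST00000090457", "ENSMUST00000110582",
                            "ENSMUST00000110585"),
  mgi_symbol            = c("4930451G09Rik", "Jrkl", "Skp2"),
  stringsAsFactors = FALSE)

missing_orgdb   <- unmapped_ids(query, orgdb_res,   "ENSEMBLTRANS",          "SYMBOL")
missing_biomart <- unmapped_ids(query, biomart_res, "ensembl_transcript_id", "mgi_symbol")
print(missing_orgdb)
print(missing_biomart)
```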

Kevin


Dear Kevin, while running the following commands in RStudio

lookup <- c('ENSMUST00000110582','ENSMUST00000110585','ENSMUST00000110586',
  'ENSMUST00000135559','ENSMUST00000166082','ENSMUST00000135552',
  'ENSMUST00000207020','ENSMUST00000192677','ENSMUST00000090457','ENSMUST00000207029')
library(biomaRt)
mart <- useMart('ensembl', dataset = 'mmusculus_gene_ensembl')

I am facing this issue:

Ensembl site unresponsive, trying uswest mirror
Error in curl::curl_fetch_memory(url, handle = handle) :
  SSL certificate problem: unable to get local issuer certificate

So, how am I supposed to overcome this problem?


Hi, this problem is related [I think] to 'overload' on Ensembl's side, and/or it could be that one or more greedy users are abusing their connection with Ensembl. You will just have to keep re-trying. Not much that I can do - sorry.
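
The "keep re-trying" advice can be automated with a small wrapper. The following is a minimal sketch, not something biomaRt provides: `with_retries` is a hypothetical helper, and the attempt count and wait time are arbitrary defaults. The demonstration uses a stand-in function that fails twice so that it runs without network access; the intended biomaRt call is shown in a comment.

```r
# Hypothetical helper: call `fn()` up to `attempts` times, pausing `wait` seconds
# between failures. Returns the first successful result, or re-raises the last error.
with_retries <- function(fn, attempts = 3, wait = 5) {
  last_error <- NULL
  for (i in seq_len(attempts)) {
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    last_error <- result
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < attempts) Sys.sleep(wait)
  }
  stop(last_error)
}

# Intended usage (needs the biomaRt package and network access, so left commented):
# library(biomaRt)
# mart <- with_retries(function() useMart("ensembl", dataset = "mmusculus_gene_ensembl"))

# Offline demonstration with a stand-in function that fails twice, then succeeds.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated connection failure")
  "connected"
}
outcome <- with_retries(flaky, attempts = 5, wait = 0)
print(outcome)
```

biomaRt also offers `useEnsembl()`, which accepts a `mirror` argument for choosing a specific Ensembl mirror; whether a given mirror is reachable at any moment is outside what this sketch can guarantee.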


That's fine. Thank you very much, Mr. Kevin.


Yes, Mr. Kevin, you are right. There was a network issue. Today I got results:

      ensembl_gene_id ensembl_transcript_id entrezgene_id    mgi_symbol
1  ENSMUSG00000022543    ENSMUST00000090457         74684 4930451G09Rik
2  ENSMUSG00000079083    ENSMUST00000110582         77532          Jrkl
3  ENSMUSG00000054115    ENSMUST00000110585         27401          Skp2
4  ENSMUSG00000060227    ENSMUST00000110586        319996         Golm2
5  ENSMUSG00000020744    ENSMUST00000135552         67283      Slc25a19
6  ENSMUSG00000038271    ENSMUST00000135559        320678         Iffo1
7  ENSMUSG00000025512    ENSMUST00000166082         68038         Chid1
8  ENSMUSG00000034755    ENSMUST00000192677        245578       Pcdh11x
9  ENSMUSG00000058966    ENSMUST00000207020         68952        Tlcd3b
10 ENSMUSG00000109498    ENSMUST00000207029            NA       Gm45222

Thank you very much.


Great - thank you for returning with the extra information.
