Question: annotation - biomaRt - getBM - multiple entrez ID
Learner wrote:

Hi

I am trying to follow the approach described in this thread, but without success:

https://support.bioconductor.org/p/52407/

When I run the code as described there, it works fine:

ex <- c("ENSG00000215417", "ENSG00000224078", "ENSG00000198366",
"ENSG00000196176", "ENSG00000166012", "ENSG00000158406",
"ENSG00000196787")
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
gene2genomeEx <- getBM(values = ex, filters = "ensembl_gene_id", mart
= mart, attributes = c("ensembl_gene_id", "entrezgene","hgnc_symbol",
"external_gene_id", "external_gene_db", "description",
"chromosome_name", "strand"))

However, when I try to filter by gene name instead, it fails with an error:

ex <- c("ACTN4","TUBA1B","ACTN1","TP53")
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
gene2genomeEx <- getBM(values = ex, filters = "external_gene_name", mart
= mart, attributes = c("external_gene_name", "entrezgene","hgnc_symbol",
"external_gene_id", "external_gene_db", "description",
"chromosome_name", "strand"))

Any thoughts?

Tags: entrez, biomart, genome
modified 7 months ago by Emily_Ensembl • written 7 months ago by Learner
Kevin Blighe wrote:

The problem is that there are no attributes called external_gene_id or external_gene_db.

Take a look:

require(biomaRt)

Look up Ensembl gene IDs:

ex <- c("ENSG00000215417", "ENSG00000224078", "ENSG00000198366",
"ENSG00000196176", "ENSG00000166012", "ENSG00000158406",
"ENSG00000196787")
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
gene2genomeEx <- getBM(values = ex,
  filters = "ensembl_gene_id",
  mart = mart,
  attributes = c("ensembl_gene_id", "entrezgene_id",
    "hgnc_symbol", "external_gene_name",
    "description", "chromosome_name",
    "strand"))
gene2genomeEx

  ensembl_gene_id entrezgene_id hgnc_symbol external_gene_name
1 ENSG00000158406          8365    HIST1H4H           HIST1H4H
2 ENSG00000166012         79101       TAF1D              TAF1D
3 ENSG00000196787          8969   HIST1H2AG          HIST1H2AG
4 ENSG00000215417        407975     MIR17HG            MIR17HG
5 ENSG00000224078          3653      SNHG14             SNHG14

Look up 'external' gene names:

ex <- c("ACTN4","TUBA1B","ACTN1","TP53")
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
gene2genomeEx <- getBM(values = ex,
  filters = "external_gene_name",
  mart = mart,
  attributes = c("external_gene_name", "entrezgene_id",
    "hgnc_symbol", "description",
    "chromosome_name", "strand"))
gene2genomeEx

  external_gene_name entrezgene_id hgnc_symbol
1               TP53          7157        TP53
2              ACTN4            81       ACTN4
3             TUBA1B         10376      TUBA1B
4              ACTN1            87       ACTN1
5              ACTN4            81       ACTN4
                                            description chromosome_name strand
1 tumor protein p53 [Source:HGNC Symbol;Acc:HGNC:11998]              17     -1
2     actinin alpha 4 [Source:HGNC Symbol;Acc:HGNC:166]              19      1
3  tubulin alpha 1b [Source:HGNC Symbol;Acc:HGNC:18809]              12     -1
4     actinin alpha 1 [Source:HGNC Symbol;Acc:HGNC:163]              14     -1
5     actinin alpha 4 [Source:HGNC Symbol;Acc:HGNC:166]  CHR_HG26_PATCH      1
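
Note that ACTN4 appears twice in this output: once on chromosome 19 and once on CHR_HG26_PATCH, which looks like an assembly patch region rather than a separate gene. If only the primary-assembly entry is wanted, one option is to filter on chromosome_name after the query. Below is a minimal sketch that rebuilds the relevant columns of the table above by hand so that it runs offline. The `standard` vector (chromosomes 1-22, X, Y, MT) is an assumption about what counts as the primary assembly, not something taken from biomaRt:

```r
# Relevant columns copied from the output above
res <- data.frame(
  external_gene_name = c("TP53", "ACTN4", "TUBA1B", "ACTN1", "ACTN4"),
  chromosome_name    = c("17", "19", "12", "14", "CHR_HG26_PATCH"),
  stringsAsFactors   = FALSE
)

# Assumption: "standard" chromosomes are 1-22, X, Y and MT
standard <- c(as.character(1:22), "X", "Y", "MT")

# Keep only rows located on a standard chromosome
primary <- res[res$chromosome_name %in% standard, ]
primary
```

The same subsetting can be applied directly to the data frame returned by getBM, as long as chromosome_name is among the requested attributes.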

Kevin

modified 5 days ago • written 7 months ago by Kevin Blighe

@Kevin Blighe Kevin, how did you check for it? I checked the attributes, but with no success. Is there a way to make them unique based on GO ID?

written 7 months ago by Learner

This is the first time that you have mentioned GO IDs. What do you mean? Please explain in detail the issue you are facing.

written 7 months ago by Kevin Blighe

@Kevin Blighe instead of the gene names, please now try this one; you will see that you get an error:

ex <- c("GO:0000002","GO:0042254","GO:0042254","GO:0000022","GO:0000028","GO:0000028","GO:0000045")

modified 7 months ago • written 7 months ago by Learner

If you're trying to run exactly the same query, but providing a list of GO IDs, then it's expected that you will get an error (or at least no results). The filters argument defines the column in the database you want to search. If you're asking to search a column containing gene IDs or symbols, but looking for GO IDs (which are completely different from gene names), then you won't find any results.

You'll need to include the full code that produced the error, otherwise it's very hard to see what exactly you're trying to do.

written 7 months ago by Mike Smith

@Mike Smith I get this error, but I don't know what argument 'listFilters' takes:

Error in getBM(values = ex, filters = "go_id", mart = mart, attributes = c("go_id",  : 
  Invalid filters(s): go_id 
Please use the function 'listFilters' to get valid filter names

modified 7 months ago by RamRS • written 7 months ago by Learner

To filter by GO IDs you have to use the filter for GO IDs. If you're just looking to get genes associated with those exact terms, use go; if you want to get the genes associated with all their child terms too, use go_parent_term.
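
On the question of what listFilters takes: it is called with the mart object and returns a table with a name column, and listAttributes works the same way. The sketch below shows one way to search that table and then query by GO term. Both find_names and genes_for_go are hypothetical helper names made up for this illustration, and the subsetting on go_id assumes (as seems to be the case) that the go_id attribute can list every term annotated to a matched gene, not only the terms queried:

```r
# Search a listFilters()/listAttributes() table for names matching a pattern.
# 'tbl' is assumed to have a 'name' column, as both functions return.
find_names <- function(tbl, pattern) {
  tbl[grepl(pattern, tbl$name, ignore.case = TRUE), , drop = FALSE]
}

# Hypothetical wrapper: genes annotated with exactly the given GO terms,
# using the "go" filter suggested above.
genes_for_go <- function(go_ids, mart) {
  res <- biomaRt::getBM(values = unique(go_ids),
    filters = "go",
    mart = mart,
    attributes = c("go_id", "ensembl_gene_id", "external_gene_name"))
  # Keep only rows for the terms that were actually asked for
  res[res$go_id %in% go_ids, , drop = FALSE]
}

# Usage (needs an internet connection):
# mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# find_names(biomaRt::listFilters(mart), "^go")
# genes_for_go(c("GO:0000002", "GO:0042254"), mart)
```

With go_parent_term as the filter, the final subsetting step would need to be dropped, since the matching rows would then carry child-term IDs rather than the queried ones.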

written 7 months ago by Emily_Ensembl

Hi Kevin, I've tried to copy and paste exactly what you did:

ex <- c("ENSG00000215417", "ENSG00000224078", "ENSG00000198366",
"ENSG00000196176", "ENSG00000166012", "ENSG00000158406",
"ENSG00000196787")
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
gene2genomeEx <- getBM(values = ex,
  filters = "ensembl_gene_id",
  mart = mart,
  attributes = c("ensembl_gene_id", "entrezgene",
    "hgnc_symbol", "external_gene_name",
    "description", "chromosome_name",
    "strand"))
gene2genomeEx

But it doesn't like the attribute entrezgene:

Error in getBM(values = ex, filters = "ensembl_gene_id", mart = mart,  : 
  Invalid attribute(s): entrezgene 
Please use the function 'listAttributes' to get valid attribute names

Do you have any suggestion as to why?

Thank you

written 6 days ago by Morris_Chair

Try using the attribute entrezgene_id instead. Ensembl changed this in release 97, which only just came out (http://ftp.ensembl.org/pub/release-97/release_97_biomart_changes.txt).
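
For scripts that have to run against both older and newer Ensembl releases, one possible approach is to check which of the two names the mart actually offers before building the query. This is only a sketch: pick_attribute is a hypothetical helper written for this example, not a biomaRt function.

```r
# Return the first candidate attribute name that is present in 'available'.
# 'available' would normally be listAttributes(mart)$name.
pick_attribute <- function(available,
                           candidates = c("entrezgene_id", "entrezgene")) {
  hit <- candidates[candidates %in% available]
  if (length(hit) == 0) {
    stop("none of the candidate attributes are available: ",
         paste(candidates, collapse = ", "))
  }
  hit[1]
}

# Usage (needs an internet connection):
# mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# entrez <- pick_attribute(biomaRt::listAttributes(mart)$name)
# biomaRt::getBM(values = "ENSG00000166012",
#   filters = "ensembl_gene_id",
#   mart = mart,
#   attributes = c("ensembl_gene_id", entrez))
```

Note that the Entrez column in the result will then be named after whichever attribute was picked, so downstream code should refer to it through the same variable.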

modified 6 days ago • written 6 days ago by Mike Smith

Thanks Mike. I noticed this change myself and will update old code where needed, including my answer here (above).

written 5 days ago by Kevin Blighe