Retrieve only protein-coding Ensembl gene IDs and gene symbols
1
0
Entering edit mode
2.0 years ago
iibrams07 ▴ 10

I have tried several ways, without success, to retrieve the current list of Ensembl gene IDs, together with the gene symbols, for protein-coding genes only, using the R package biomaRt. Here is the code:

library(biomaRt)
ensembl = useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
results <- getBM(attributes=c("ensembl_gene_id","gene_biotype"),filters = c("ensembl_gene_id","biotype"), values=list("protein_coding"), mart=ensembl)
results

The error message is:

Error in names(values) <- filters : 
'names' attribute [2] must be the same length as the vector [1]

I also need the gene symbol for each Ensembl gene ID (for example, TSPAN6). I tried adding "hgnc_id" to both the attributes vector and the filters vector, and got an error message similar to the one shown above. What should I do to accomplish the task? Many thanks for any comment.

biomart • 1.3k views

In attributes you have gene_biotype, but in filters you have biotype. Maybe that has something to do with the error?

2.0 years ago
Basti ★ 2.0k

The error occurs because you specified two filters but provided only one value. Additionally, "biotype" is not among the filters listed by listFilters(ensembl), but I guess "transcript_biotype" is what you need.
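The length mismatch can be reproduced in base R without contacting Ensembl. This is only a minimal sketch: it assumes that the assignment quoted in the error message, names(values) <- filters, is the step that fails, and the variable names res and filters_ok are made up for illustration.

```r
filters <- c("ensembl_gene_id", "biotype")  # two filter names
values  <- list("protein_coding")           # but only one value

# Same assignment as in the error message; it fails because the lengths differ
res <- tryCatch({
  names(values) <- filters
  "ok"
}, error = function(e) conditionMessage(e))
print(res)

# With exactly one filter name per value, the assignment succeeds
filters_ok <- c("transcript_biotype")
names(values) <- filters_ok
str(values)
```

So each entry in filters needs a matching entry in values, in the same order.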

To fix this, you can run:

results <- getBM(attributes=c("ensembl_gene_id","hgnc_symbol","transcript_biotype"),filters = c("transcript_biotype"), values=list("protein_coding"), mart=ensembl)
results

If you need further information, have a look at listAttributes(ensembl) and select what you need.


Since most Ensembl BioMart attributes are either gene_* or transcript_*, I don't think it makes sense to combine ensembl_gene_id with transcript_biotype.
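One way to stay at the gene level is to request the gene_biotype attribute from the original question and subset locally, so that no filter name has to be guessed. This is a rough sketch rather than a tested recipe: keep_protein_coding() and fetch_protein_coding() are hypothetical helper names, the toy table uses made-up IDs, and the fetch step needs the biomaRt package and a network connection.

```r
# Hypothetical helper: keep only rows whose gene-level biotype is protein_coding
keep_protein_coding <- function(genes) {
  genes[genes$gene_biotype %in% "protein_coding", , drop = FALSE]
}

# Hypothetical wrapper: needs biomaRt installed and a network connection
fetch_protein_coding <- function() {
  ensembl <- biomaRt::useMart(biomart = "ensembl",
                              dataset = "hsapiens_gene_ensembl")
  # No filters: download every gene with its symbol and gene-level biotype
  genes <- biomaRt::getBM(
    attributes = c("ensembl_gene_id", "hgnc_symbol", "gene_biotype"),
    mart = ensembl
  )
  keep_protein_coding(genes)
}

# Offline check on a toy table (made-up IDs, not real Ensembl records)
toy <- data.frame(
  ensembl_gene_id = c("GENE_A", "GENE_B", "GENE_C"),
  hgnc_symbol     = c("SYM_A", "", "SYM_C"),
  gene_biotype    = c("protein_coding", "lncRNA", "protein_coding"),
  stringsAsFactors = FALSE
)
toy_pc <- keep_protein_coding(toy)
print(toy_pc)
```

Calling results <- fetch_protein_coding() would then download the full gene table once and return only the protein-coding rows. The trade-off is a larger download in exchange for not depending on any particular filter name.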
