I am working with human and rat gene sets. From a list of Ensembl gene IDs, I want to retrieve several attribute columns via biomaRt. With about 4000 genes, the query runs very slowly (about 30 minutes). I can save the resulting R object and reuse it later, but is there any way to download a complete set of gene annotations in one go, including Gene Ontology, Rfam, Pfam, InterPro, etc.? In particular, I am interested in downloading the attributes listed in the query below.
This is a snippet of what I am trying to do:

library(biomaRt)

# Example of 20 gene IDs.
ensids <- c(
  'ENSRNOG00000000001', 'ENSRNOG00000000009', 'ENSRNOG00000000040',
  'ENSRNOG00000000055', 'ENSRNOG00000000082', 'ENSRNOG00000000091',
  'ENSRNOG00000000129', 'ENSRNOG00000000137', 'ENSRNOG00000000138',
  'ENSRNOG00000000142', 'ENSRNOG00000000156', 'ENSRNOG00000000187',
  'ENSRNOG00000000196', 'ENSRNOG00000000231', 'ENSRNOG00000000233',
  'ENSRNOG00000000239', 'ENSRNOG00000000277', 'ENSRNOG00000000288',
  'ENSRNOG00000000307', 'ENSRNOG00000000321')

# Connect to the rat gene dataset.
m <- useMart(biomart = "ENSEMBL_MART_ENSEMBL",
             dataset = "rnorvegicus_gene_ensembl")

# Retrieve the annotation columns for the listed genes.
enstable <- getBM(mart = m,
                  attributes = c('ensembl_gene_id', 'gene_biotype',
                                 'external_gene_name', 'superfamily', 'family',
                                 'go_id', 'goslim_goa_accession', 'rfam',
                                 'pirsf', 'interpro', 'tigrfam'),
                  filters = c('ensembl_gene_id'),
                  values = ensids)
Even though the initial download may take longer, I see much greater benefits for subsequent uses: the Ensembl server is not stressed by repeated queries, the runtime is shorter, and the analysis no longer depends on an internet connection.
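
To make the "download once, reuse offline" idea concrete, here is a minimal sketch of what I have in mind. The names `cached_query` and `fetch_rat_annotation` and the file name `rat_annotation.rds` are hypothetical, not part of biomaRt. The sketch assumes that calling `getBM` without `filters` and `values` returns the requested attributes for every gene in the dataset; with many multi-valued attributes such a query may still be slow or time out, so it might need to be split into several smaller ones.

```r
# Hypothetical helper (not part of biomaRt): run an expensive query once,
# store the result on disk, and read it back on later calls.
cached_query <- function(cache_file, fetch_fun) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))   # cache hit: no network access needed
  }
  result <- fetch_fun()           # cache miss: run the query
  saveRDS(result, cache_file)     # keep a local copy for next time
  result
}

# Assumed wrapper around the query above. Leaving out `filters` and `values`
# asks BioMart for all genes in the dataset rather than a selected list.
fetch_rat_annotation <- function() {
  m <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                        dataset = "rnorvegicus_gene_ensembl")
  biomaRt::getBM(mart = m,
                 attributes = c("ensembl_gene_id", "gene_biotype",
                                "external_gene_name", "go_id", "interpro"))
}

# Not run here (the first call needs network access):
# annotation <- cached_query("rat_annotation.rds", fetch_rat_annotation)
# enstable <- annotation[annotation$ensembl_gene_id %in% ensids, ]
```

With something like this, only the first call would contact the Ensembl server; later sessions would subset the local table by gene ID.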