I am trying to get gene symbols for gene IDs that I got from mouse datasets. The gene IDs look like this: 0610009B22Rik. The code that I am trying to use is the following:
library(biomaRt)
ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
mouse_gene_ids <- dataset[, 1]
foo <- getBM(attributes = c('ensembl_gene_id', 'external_gene_name'),
             filters = 'genedb',
             values = mouse_gene_ids,
             mart = ensembl)
I am getting zero results as output after the query runs. I guess the filters parameter is wrong. Any suggestions would be greatly appreciated.
output:
Why can't I get any gene symbols?
Perhaps you want to try the entrezgene_id filter instead?

Yes. I tried it and it works.
Thank you so much for the help!

Hello, I tried to follow the previous posts and everything ran, but I did not get anything back as a result. My code is below:
library(biomaRt)
ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
genes_ids <- c('ENSMUSG00000051951.5', 'ENSMUSG00000025900.12', 'ENSMUSG00000025902.13')
gs_heatdata <- getBM(attributes = c("external_gene_name"),
                     filters = "mgi_symbol",
                     values = genes_ids,
                     mart = ensembl)

Hi, you need to remove the trailing numbers from the gene IDs. Also, the value for filters should be ensembl_gene_id. Please try this:

It works perfectly, but I did not understand how you managed it:
- Does the trailing number stand for the 0s before the actual ID?
- Could you explain in particular what sub('\\.[0-9]*$', '', refers to?
Thank you a lot!

That is a regular expression that says: substitute a period followed by any run of digits (0 to 9) at the end of the string with nothing (i.e. delete it).

Sorry, I forgot one more question. How can I make the code "cleaner"? The output in the end shows two features that are the same, 'external_gene_name' and 'mgi_symbol'.
Thank you!

Change the following line
to
Or keep mgi_symbol if you want to keep that instead.

I tried with my whole dataset, but it did not work. In return I just get an empty table with external_gene_name and ensembl_gene_id as headers.

Hi, the converted IDs are contained in gs_heatdata. You then have to align these to the rownames of heatdata, and then replace them with the external gene IDs (MGI symbols).

Hi, how can I align them? Which function should I use? How can I then replace them with the external gene IDs? Should I first convert the row.names of heatdata into the first column and then somehow combine the data frame gs_heatdata with the data frame heatdata? Thank you a lot! :)

Hi, please take a look at functions such as which() and match(), and other functions from the dplyr package for matching data frames. A quick example:
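
As a small sketch of how match() behaves on IDs like the ones in this thread: the names ids and lookup are hypothetical, and the symbols symA to symC are placeholders, not real biomaRt output.

```r
# Ensembl IDs in the order they appear in the data (version suffixes already removed)
ids <- c("ENSMUSG00000051951", "ENSMUSG00000025900", "ENSMUSG00000025902")

# Stand-in for a getBM() result; the symbols are made-up placeholders
lookup <- data.frame(
  ensembl_gene_id    = c("ENSMUSG00000025900", "ENSMUSG00000025902", "ENSMUSG00000051951"),
  external_gene_name = c("symA", "symB", "symC"),
  stringsAsFactors   = FALSE
)

# match(x, table) gives, for each element of x, the position of its
# first match in table (NA if there is none)
idx <- match(ids, lookup$ensembl_gene_id)

# Use those positions to pull the symbols out in the order of ids
symbols <- lookup$external_gene_name[idx]
print(data.frame(ids, symbols))
```

The order of the two arguments matters: the first argument decides the order and length of the result, the second is only the table that gets searched.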

Hi, I tried for now with match() but I think it did not work.

match() returns the indices [in heatdata] of the elements of gs_heatdata. What you likely need is:
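
A minimal sketch of that direction, assuming heatdata has versioned Ensembl IDs as row names and gs_heatdata was retrieved with both ensembl_gene_id and external_gene_name. The toy values and placeholder symbols below are made up, not real query results.

```r
# Toy stand-in for heatdata: row names carry version suffixes, as in the thread
heatdata <- matrix(
  1:6, nrow = 3,
  dimnames = list(
    c("ENSMUSG00000051951.5", "ENSMUSG00000025900.12", "ENSMUSG00000025902.13"),
    c("sample1", "sample2")
  )
)

# Toy stand-in for the getBM() result (placeholder symbols, different row order)
gs_heatdata <- data.frame(
  ensembl_gene_id    = c("ENSMUSG00000025900", "ENSMUSG00000025902", "ENSMUSG00000051951"),
  external_gene_name = c("symA", "symB", "symC"),
  stringsAsFactors   = FALSE
)

# Strip the version suffix, then look up each row name in gs_heatdata
clean_ids <- sub("\\.[0-9]*$", "", rownames(heatdata))
idx <- match(clean_ids, gs_heatdata$ensembl_gene_id)

# Sanity check: every row of heatdata is paired with its own Ensembl ID
stopifnot(all(gs_heatdata$ensembl_gene_id[idx] == clean_ids))

# Replace the row names with the looked-up symbols
rownames(heatdata) <- gs_heatdata$external_gene_name[idx]
print(heatdata)
```

Here the row names of heatdata are the first argument to match(), so idx has one entry per row of heatdata and can be used directly to reorder gs_heatdata.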

OK, I will try this. Just for me to understand: can I also just use the previous genes_ids, or do I have to put the entire sub('\\.[0-9]*$', '', rownames(heatdata)) in match() and, after that, in all()? Thank you!!

It returned this:

I think I found the problem, and it was right in front of me: the filters setting was wrong. I had to use filters = "ensembl_gene_id" instead of filters = "mgi_symbol". Now gs_heatdata looks good:

But if I proceed with the previous code, I still get NA:
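
One way to narrow down where NA values from match() come from, sketched on toy data. The object names, the placeholder symbols, and the ID "ENSMUSG_NOT_RETURNED" are all hypothetical. Two common causes are looking up IDs that still carry the version suffix, and IDs for which the query returned no row.

```r
# Toy lookup table standing in for gs_heatdata (placeholder symbols)
gs <- data.frame(
  ensembl_gene_id    = c("ENSMUSG00000025900", "ENSMUSG00000051951"),
  external_gene_name = c("symA", "symC"),
  stringsAsFactors   = FALSE
)

# Cause 1: a versioned ID never matches the unversioned IDs in the table
versioned_hit <- match("ENSMUSG00000051951.5", gs$ensembl_gene_id)  # NA

# Cause 2: an ID that is simply absent from the query result
clean_ids <- c("ENSMUSG00000051951", "ENSMUSG00000025900", "ENSMUSG_NOT_RETURNED")
idx <- match(clean_ids, gs$ensembl_gene_id)
missing_ids <- clean_ids[is.na(idx)]
print(missing_ids)

# Optional fallback: keep the Ensembl ID wherever no symbol is available
symbols <- gs$external_gene_name[idx]
labels <- ifelse(is.na(symbols) | symbols == "", clean_ids, symbols)
print(labels)
```

If missing_ids is empty and the result is still NA, it may be worth checking that the first argument to match() is the stripped row names and the second is the ensembl_gene_id column, not the other way round.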