I am trying to get gene symbols for gene IDs that I got from mouse datasets. The gene IDs look like this:
0610009B22Rik. The code that I am trying to use is the following:
ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
mouse_gene_ids <- dataset[, 1]
foo <- getBM(attributes = c('ensembl_gene_id', 'external_gene_name'), filters = 'genedb', values = mouse_gene_ids, mart = ensembl)
I am getting zero results as output after the query runs. I guess the filters parameter is wrong. Any suggestions would be greatly appreciated.
Why can't I get any gene symbol
Perhaps you want to try the
Yes. I tried it and it works.
Thank you so much for the help!
Hello, I tried to follow the previous posts and the code ran without errors, but I did not get any results back. My code is below:
library(biomaRt)
ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
genes_ids <- c('ENSMUSG00000051951.5', 'ENSMUSG00000025900.12', 'ENSMUSG00000025902.13')
gs_heatdata <- getBM(attributes = c("external_gene_name"), filters = "mgi_symbol", values = genes_ids, mart = ensembl)
Hi, you need to remove the trailing numbers from the gene IDs. Also, the value for
filters should be ensembl_gene_id. Please try this:
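A minimal sketch of the fix described here, assuming the biomaRt package is installed. The helper name lookup_symbols and the exact attribute list are assumptions for illustration, not something taken from the thread:

```r
# Hypothetical helper (name is an assumption): strip the version suffix from
# Ensembl gene IDs, then query Ensembl with the ensembl_gene_id filter.
lookup_symbols <- function(ids, mart) {
  # "ENSMUSG00000051951.5" becomes "ENSMUSG00000051951"
  clean_ids <- sub("\\.[0-9]*$", "", ids)
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name"),
    filters    = "ensembl_gene_id",
    values     = clean_ids,
    mart       = mart
  )
}

# Usage (needs network access to Ensembl, so it is left commented out):
# ensembl <- biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
# gs_heatdata <- lookup_symbols(genes_ids, ensembl)
```

Keeping ensembl_gene_id among the attributes matters, because it is the column that lets you line the results up with your input IDs afterwards.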
It works perfectly, but I did not understand how you managed it:
- Does the trailing number stand for the 0s before the actual ID?
- Could you explain in particular what sub('\\.[0-9]*$', '', refers to?
Thank you a lot!
That is a regular expression that says:
substitute a period followed by any run of digits (0 to 9) at the end of the string with nothing (i.e. delete it).
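To see the pattern piece by piece, here is a small base R sketch using the three IDs posted earlier in the thread. The variable names stripped_ids and unversioned are only for illustration:

```r
# Versioned Ensembl gene IDs from the thread
genes_ids <- c("ENSMUSG00000051951.5",
               "ENSMUSG00000025900.12",
               "ENSMUSG00000025902.13")

# "\\."    a literal period (escaped, since "." alone matches any character)
# "[0-9]*" zero or more digits
# "$"      the end of the string
stripped_ids <- sub("\\.[0-9]*$", "", genes_ids)
print(stripped_ids)

# An ID without a ".digits" suffix does not match and is returned unchanged
unversioned <- sub("\\.[0-9]*$", "", "ENSMUSG00000051951")
print(unversioned)
```

Because the pattern is anchored with $, only the suffix at the very end of each ID is removed.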
Sorry, I forgot one more question. How can I make the code "cleaner"? The output in the end shows me two columns that are the same, 'external_gene_name' and 'mgi_symbol'.
Change the following line
mgi_symbol if you want to keep that instead.
I tried with my whole dataset but it did not work. I just get back an empty table with external_gene_name and ensembl_gene_id as headers.
Hi, the converted IDs are contained in gs_heatdata. You then have to align these to the rownames of heatdata, and then replace them with the external gene IDs (MGI symbols).
Hi, how can I align them? Which function should I use? How can I then replace them with the external gene IDs? Should I first convert the heatdata in the first column and then somehow combine the df gs_heatdata with the df heatdata? Thank you a lot! :)
Hi, please take a look at functions such as match(), and at other functions from the dplyr package for matching data frames.
A quick example:
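A small sketch of what match() does, using toy vectors. The third query ID is made up so that one lookup fails:

```r
# Toy vectors for illustration; "ENSMUSG00000099999" is a made-up ID
query_ids <- c("ENSMUSG00000025902", "ENSMUSG00000051951", "ENSMUSG00000099999")
table_ids <- c("ENSMUSG00000051951", "ENSMUSG00000025900", "ENSMUSG00000025902")

# For each element of query_ids, match() gives its position in table_ids,
# or NA when the element is not found there
idx <- match(query_ids, table_ids)
print(idx)
# idx is c(3, 1, NA)
```

The result always has the same length and order as the first argument, which is what makes it useful for aligning one table to another.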
Hi, I tried for now with match() but I think it did not work.
match() returns the indices [in
heatdata] of the elements of
What you likely need is:
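A minimal sketch of aligning and replacing the rownames, assuming heatdata has versioned Ensembl IDs as rownames and gs_heatdata is the table returned by getBM(). Both objects below are mock data, and the symbols GeneA and GeneC are placeholders rather than real annotations:

```r
# Mock expression matrix (values are made up)
heatdata <- matrix(
  1:6, nrow = 3,
  dimnames = list(
    c("ENSMUSG00000051951.5", "ENSMUSG00000025900.12", "ENSMUSG00000025902.13"),
    c("s1", "s2")
  )
)

# Mock getBM() result: rows in a different order, one gene missing
gs_heatdata <- data.frame(
  ensembl_gene_id    = c("ENSMUSG00000025902", "ENSMUSG00000051951"),
  external_gene_name = c("GeneC", "GeneA"),
  stringsAsFactors   = FALSE
)

# Strip version suffixes, then find each rowname's position in the lookup table
clean_ids <- sub("\\.[0-9]*$", "", rownames(heatdata))
idx <- match(clean_ids, gs_heatdata$ensembl_gene_id)

# Sanity check: every matched row points at the same Ensembl ID
matched <- !is.na(idx)
stopifnot(all(gs_heatdata$ensembl_gene_id[idx[matched]] == clean_ids[matched]))

# Take the symbol where one was found, otherwise keep the Ensembl ID
symbols <- gs_heatdata$external_gene_name[idx]
symbols[!matched] <- clean_ids[!matched]
rownames(heatdata) <- symbols
print(rownames(heatdata))
```

Falling back to the Ensembl ID for unmatched genes is a design choice here; you could also drop those rows instead.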
OK, I will try this. Just for me to understand: can I also just use the previous genes_ids, or do I have to put the entire sub('\\.[0-9]*$', '', rownames(heatdata)) in all()? Thank you!!
It returned this:
I think I found the problem, and it was right in front of me: the filters setting was wrong. I had to use filters = "ensembl_gene_id" instead of filters = "mgi_symbol". Now the
but if I proceed with the previous code I get anyway