Hello,
I am trying to convert human gene names to mouse gene names using biomaRt.
Here is the code I am using:
library("biomaRt")
ensembl = useMart("ensembl")
ensembl = useDataset("hsapiens_gene_ensembl",mart=ensembl)
all <- getBM(attributes = c("external_gene_name","mmusculus_homolog_associated_gene_name"), mart=ensembl)
The problem is that for some genes I am interested in, there are two mmusculus_homolog_associated_gene_name entries. One contains the right homolog, the other is simply empty. One such gene is PTPRD.
When I perform the actual conversion with this code:
mapping <- getBM(attributes = c("external_gene_name","mmusculus_homolog_associated_gene_name"),
filters = "external_gene_name",
values = genes_of_interest,  # character vector of human gene symbols
mart=ensembl)
the PTPRD entry with the empty homolog is picked instead of the one with Ptprd. This happens for many different genes and messes up my conversion table.
If anyone can help me figure out this problem or tell me why there are multiple external_gene_name entries in the first place, I'd be very happy.
EDIT:
Added an image to emphasize what I am seeing when analyzing the all object.
NaN generally stands for not-a-number, as in "undefined" or "unrepresentable". I don't think there is a gene called nan. There is one called Nans.
Sorry, it's not actually NaN. It's just empty. I'll specify this in my question.
You can find a list of human-mouse gene homologs at this link from MGI/Jax.
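If you want to stay with biomaRt, one possible workaround is to drop the rows with an empty homolog after the query. This is only a minimal sketch: the small data frame below is a made-up stand-in for the getBM() result (its values are illustrative, not real query output), and it assumes the empty entries come back as empty strings or NA. It does not explain why the duplicates exist. One guess is that several Ensembl gene records share the same symbol, and adding ensembl_gene_id to the attributes would show whether that is the case.

```r
# Toy stand-in for the data frame returned by getBM(); values are illustrative only.
mapping <- data.frame(
  external_gene_name = c("PTPRD", "PTPRD", "TP53"),
  mmusculus_homolog_associated_gene_name = c("", "Ptprd", "Trp53"),
  stringsAsFactors = FALSE
)

# Keep only rows where a mouse homolog name is actually present
# (assumption: missing homologs are "" or NA).
homolog <- mapping$mmusculus_homolog_associated_gene_name
mapping_clean <- mapping[!is.na(homolog) & homolog != "", ]

# Remove exact duplicate human-mouse pairs that may remain after filtering.
mapping_clean <- unique(mapping_clean)

print(mapping_clean)
```

Note that genes with no mouse homolog at all will disappear from the table with this approach, so you may want to compare the result against your original gene list.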