I use the following biomaRt code to retrieve some attributes, including 'entrezgene_id':
library(biomaRt)
ensembl <- useMart("ensembl")
ensembl <- useDataset("hsapiens_gene_ensembl", mart = ensembl)
entrez.data <- getBM(attributes = c('ensembl_gene_id', 'entrezgene_id', 'entrezgene_accession', 'entrezgene_description'),
                     filters = 'ensembl_gene_id',
                     values = result.res$ID,
                     mart = ensembl)
After some filtering, however, I found that Biomart returns multiple 'entrezgene_id' values for a single 'ensembl_gene_id':
ENSG00000111215    5554     PRH1    proline rich protein HaeIII subfamily 1
ENSG00000111215    11272    PRR4    proline rich 4
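For reference, this is roughly how such duplicated mappings can be listed in base R. It is only a minimal sketch on a hand-built data frame holding the two rows above; the names `example.data`, `n.entrez`, `multi.ids` and `multi.rows` are made up for illustration, and on real data the input would be the `entrez.data` object returned by `getBM`.

```r
# Toy stand-in for the getBM() result, built only from the two rows shown above.
example.data <- data.frame(
  ensembl_gene_id        = c("ENSG00000111215", "ENSG00000111215"),
  entrezgene_id          = c(5554, 11272),
  entrezgene_accession   = c("PRH1", "PRR4"),
  entrezgene_description = c("proline rich protein HaeIII subfamily 1",
                             "proline rich 4"),
  stringsAsFactors = FALSE
)

# Count the distinct Entrez IDs attached to each Ensembl ID.
n.entrez <- tapply(example.data$entrezgene_id,
                   example.data$ensembl_gene_id,
                   function(x) length(unique(x)))

# Ensembl IDs that map to more than one Entrez ID.
multi.ids <- names(n.entrez)[n.entrez > 1]

# All rows belonging to those ambiguous Ensembl IDs.
multi.rows <- example.data[example.data$ensembl_gene_id %in% multi.ids, ]
print(multi.rows)
```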
When I search the Ensembl website for the gene "ENSG00000111215", there is only one result, but the result report contains a line like "PRH1 (NCBI gene (formerly Entrezgene) record", which is what Biomart returns as the second 'entrezgene_id'.
How can I get rid of those "formerly Entrezgene" entries?
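The only blunt workaround I can think of is to keep a single row per 'ensembl_gene_id', as in the sketch below. It uses a hand-built data frame with just the two rows from my output (`example.data` and `one.per.gene` are hypothetical names), and it is an assumption rather than a proper fix: which Entrez ID survives depends purely on row order, not on which mapping is biologically correct.

```r
# Toy stand-in for the getBM() result, built only from the two rows in the question.
example.data <- data.frame(
  ensembl_gene_id        = c("ENSG00000111215", "ENSG00000111215"),
  entrezgene_id          = c(5554, 11272),
  entrezgene_accession   = c("PRH1", "PRR4"),
  entrezgene_description = c("proline rich protein HaeIII subfamily 1",
                             "proline rich 4"),
  stringsAsFactors = FALSE
)

# duplicated() flags the second and later occurrences of each Ensembl ID,
# so negating it keeps only the first row seen for every gene.
# NOTE: this choice is arbitrary; it does not check which Entrez ID is "right".
one.per.gene <- example.data[!duplicated(example.data$ensembl_gene_id), ]
print(one.per.gene)
```

With the two rows above this keeps the PRH1 row (Entrez ID 5554) simply because it comes first.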