Hi! I'm trying to get the location (chromosome and band) of a list of Entrez Gene IDs I got using the Homo.sapiens Bioconductor package:
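# subsetByOverlaps() returns the overlapping gene ranges (with their gene_id column),
# whereas findOverlaps() returns a Hits object that has no gene_id
indx <- subsetByOverlaps(genes(TxDb.Hsapiens.UCSC.hg19.knownGene), mycoords.gr)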
Since my original data (mycoords.gr) are mapped to the GRCh37/hg19 genome version, I tried using biomaRt to get the locations from that version of the genome:
ensembl <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", host = "grch37.ensembl.org", path = "/biomart/martservice", dataset = "hsapiens_gene_ensembl")
my.symbols <- indx$gene_id
my.regions <- getBM(c("entrezgene", "hgnc_symbol", "chromosome_name", "band"), filters = "entrezgene", values = my.symbols, mart = ensembl)
I noticed, however, that some of the Entrez IDs on my list were not in my.regions. When I tried the current version of the genome instead, those IDs were present but others were missing...
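To make the comparison concrete, here is a minimal sketch of how the missing IDs can be listed. The two objects below are made-up toy stand-ins, not my real data, and missing.ids is just an illustrative name. I compare the IDs as character strings because, as far as I can tell, the TxDb gene_id values are character while the entrezgene column returned by getBM is numeric.

```r
# Toy stand-ins (hypothetical values) for the real objects in this post
my.symbols <- c("1", "10", "100", "1000")            # Entrez IDs queried
my.regions <- data.frame(entrezgene = c(1L, 100L))   # IDs that came back from getBM

# Compare as character so differing ID types do not cause false mismatches
missing.ids <- setdiff(as.character(my.symbols),
                       as.character(my.regions$entrezgene))

print(missing.ids)
```

Running the same comparison against the result from each mart would show which IDs are absent from which version.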
Is there a difference in Entrez IDs between assemblies? I also tried retrieving all of the Entrez IDs in Ensembl, and some of the IDs on my list were missing from that result too...
mapping <- getBM(attributes = c("entrezgene", "hgnc_symbol"), mart = ensembl)
I don't understand this... Is there an alternative to this method?
Thanks in advance!