Question: biomart gene coordinates do not correspond to genome browser
Asked 4.0 years ago by tonja.r (UK)

I was annotating my dataset with biomart, filtering by chromosomal region, and was surprised by the genes I got back, so I took a closer look at PRAMENP (ENSG00000197549).

According to biomart its positions are:

  chromosome_name start_position end_position strand ensembl_gene_id hgnc_symbol
1              22       21991099     22043934     -1 ENSG00000197549     PRAMENP

 

but if I look at the genome browser I get the following:

 

GENCODE Transcript Annotation ENST00000337471.4 (PRAMENP)

                  Transcript                Gene
Gencode id        ENST00000337471.4         ENSG00000197549.5
HAVANA manual id  OTTHUMT00000320276.2      OTTHUMG00000150836.3
Position          chr22:22345497-22398332   chr22:22345497-22398332

 

Because of these differences, biomart returns many genes that, according to the genome browser, are far away from my dataset (SNPs), while the genes that are really close to them (according to the genome browser) do not appear in the biomart results.
library(biomaRt)
ensembl = useMart(biomart="ENSEMBL_MART_ENSEMBL", host="www.ensembl.org",
                  path="/biomart/martservice", dataset="hsapiens_gene_ensembl")
# region filter format: chromosome:start:end
filterlist = list("22:21815836:22006492")
attributes.1 = c("chromosome_name", "start_position", "end_position", "strand", "ensembl_gene_id", "hgnc_symbol")
results.1 = getBM(attributes = attributes.1, filters = c("chromosomal_region"), values = filterlist, mart = ensembl)
unique(results.1$hgnc_symbol)
# [1] "PRAMENP" "MAPK1"   ""        "TOP3B"   "PPM1F"

But according to the genome browser (coordinates: chr22:21,815,836-22,006,492) I should have got UBE2L3, YDJC, PI4KAP2 and some more, but not the genes returned by biomart.
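The browser position string and the biomart region filter describe the same interval in two notations. A minimal base R sketch of the conversion is shown here; `browser_to_biomart` is a hypothetical helper name introduced only for illustration, and it assumes a simple `chrN:start-end` input.

```r
# Hypothetical helper (not part of biomaRt): convert a genome browser
# position such as "chr22:21,815,836-22,006,492" into the
# "chromosome:start:end" form used for the chromosomal_region filter.
browser_to_biomart <- function(pos) {
  pos <- gsub(",", "", pos, fixed = TRUE)  # drop thousands separators
  pos <- sub("^chr", "", pos)              # "chr22" -> "22"
  sub("-", ":", pos, fixed = TRUE)         # "start-end" -> "start:end"
}

region <- browser_to_biomart("chr22:21,815,836-22,006,492")
print(region)
```

This only changes the notation. It does not translate coordinates between genome assemblies.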

 

I guess the biomart dataset is built on hg38 while I am viewing hg19 in the genome browser. Is it possible to get hsapiens_gene_ensembl on hg19?
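A quick sanity check on that guess, using only the numbers quoted above, can be sketched in base R. The variable names and the `overlaps` helper are illustrative assumptions, not biomaRt functions.

```r
# Coordinates copied from the post (all on chromosome 22).
region       <- c(start = 21815836, end = 22006492)  # query region
biomart_gene <- c(start = 21991099, end = 22043934)  # PRAMENP per biomart
browser_gene <- c(start = 22345497, end = 22398332)  # PRAMENP per genome browser

# Two closed intervals overlap when each one starts before the other ends.
overlaps <- function(a, b) {
  a[["start"]] <= b[["end"]] && b[["start"]] <= a[["end"]]
}

in_biomart <- overlaps(region, biomart_gene)  # TRUE: biomart reports the gene
in_browser <- overlaps(region, browser_gene)  # FALSE: browser places it elsewhere

# Both ends differ by the same amount, so the gene length is unchanged.
shift <- unname(browser_gene - biomart_gene)
print(shift)
```

Both ends are shifted by the same 354,398 bp, which is consistent with one gene model placed on two different assemblies, although it does not prove it on its own.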

Tags: ensembl, biomart, R

Switch the genome browser to the older build and see whether the retrieved sequences are the same; that might confirm your suspicion.

written 4.0 years ago by Biomonika (Noolean)

I did that already: the biomart dataset is built on hg38 and the genome browser is on hg19. All my data is on hg19, so I want to be consistent. Is there a biomart dataset on hg19?

written 4.0 years ago by tonja.r
Answer by komal.rathi (Children's Hospital of Philadelphia, Philadelphia, PA), 4.0 years ago:

You can access Ensembl75 (hg19/GRCh37) using:

grch37 = useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")

or 

ensembl_75 = useMart(biomart="ENSEMBL_MART_ENSEMBL", host="feb2014.archive.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")
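As a rough sketch of how the original region query could be rerun against the GRCh37 mart, the query can be wrapped in a small function that takes the mart as an argument. `genes_in_region` is a hypothetical helper name; the attributes and filter are the ones used in the question. Running the usage lines needs the biomaRt package and network access, so they are left commented out.

```r
# Hypothetical wrapper around biomaRt::getBM (not a biomaRt function):
# returns the genes that a given mart reports for a "chr:start:end" region.
genes_in_region <- function(mart, region) {
  biomaRt::getBM(
    attributes = c("chromosome_name", "start_position", "end_position",
                   "strand", "ensembl_gene_id", "hgnc_symbol"),
    filters = "chromosomal_region",
    values = region,
    mart = mart
  )
}

# Usage (requires biomaRt and a reachable Ensembl server):
# grch37 <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
#                            host = "grch37.ensembl.org",
#                            path = "/biomart/martservice",
#                            dataset = "hsapiens_gene_ensembl")
# res <- genes_in_region(grch37, "22:21815836:22006492")
# unique(res$hgnc_symbol)
```

Passing the mart in as an argument makes it easy to run the same region against both builds and compare the gene lists side by side.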