Question: extracting exon coordinates on hg38
22 months ago by
Palo Alto, CA, USA
Bogdan850 wrote:

Dear all,

Could you please advise: how can we obtain the exon coordinates of the RefSeq or UCSC genes (canonical isoforms) on hg38, where each set of coordinates (chr, start, end) is also assigned the gene name?

thanks a lot, and a happy weekend,

-- bogdan

exome genome • 1.6k views
modified 22 months ago by chen1.9k • written 22 months ago by Bogdan850

Hello Bogdan!

It appears that your post has been cross-posted to another site:

This is typically not recommended as it runs the risk of annoying people in both communities.

written 22 months ago by WouterDeCoster40k

Dear gentlemen, thank you for your replies; I very much appreciate your help ;). Yes, I knew of the previous postings on extracting the hg19 exon coordinates before I emailed, although the approach applies a bit differently to hg38 and to RefSeq genes.

I thought that we might find 2 solutions to the same question: one in BioC (using GenomicFeatures) and one not related to BioC, which I can compare afterwards.

Thanks again for your help, and happy weekend ;) !

modified 22 months ago • written 22 months ago by Bogdan850

Hi Bogdan

I don't know whether you completed this in the end, but for anyone else who is trying to use hg38, I would recommend following the blog post linked above for hg19, but using the GENCODE v29 track instead of UCSC genes to download the canonical transcripts and exons file.

Follow the instructions as written, but just change the tracks over. The code given in that blog post should work too; I had no errors.
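
For anyone who would rather work from a downloaded GENCODE GTF file than from the track download, here is a minimal Python sketch of the same idea: read the exon records and emit (chr, start, end, gene name). It is not the code from the blog post. The function name `exons_from_gtf`, the `keep_transcripts` parameter and the toy records are all made up for illustration, and the sketch does not pick canonical isoforms by itself; it assumes you pass in a set of canonical transcript IDs obtained elsewhere. Coordinates are converted from the 1-based inclusive GTF convention to 0-based half-open BED style.

```python
import re

# Matches quoted key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def exons_from_gtf(lines, keep_transcripts=None):
    """Yield (chrom, start, end, gene_name) for every exon record.

    `keep_transcripts` is a hypothetical filter: an optional set of
    transcript IDs (e.g. canonical isoforms taken from another source).
    Output coordinates are BED-style: 0-based start, exclusive end.
    """
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # keep only exon features
        attrs = dict(ATTR_RE.findall(fields[8]))
        if keep_transcripts is not None and attrs.get("transcript_id") not in keep_transcripts:
            continue
        yield fields[0], int(fields[3]) - 1, int(fields[4]), attrs.get("gene_name", ".")


# Toy records (invented, not real annotation) in GTF layout.
TOY_GTF = [
    "##description: toy example\n",
    'chr1\tTOY\tgene\t101\t500\t.\t+\t.\tgene_id "G1"; gene_name "GENE_A";\n',
    'chr1\tTOY\texon\t101\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENE_A"; exon_number 1;\n',
    'chr1\tTOY\texon\t301\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENE_A"; exon_number 2;\n',
    'chr1\tTOY\texon\t151\t250\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; gene_name "GENE_A"; exon_number 1;\n',
]

for chrom, start, end, gene in exons_from_gtf(TOY_GTF, keep_transcripts={"T1"}):
    print(chrom, start, end, gene, sep="\t")
```

With a real file you would pass an open file handle (for example from `gzip.open(path, "rt")`) in place of `TOY_GTF`.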

Hope that helps


written 4 months ago by s16671530
22 months ago by
chen1.9k wrote:

Try OpenGene.jl, a library written in Julia:

using OpenGene, OpenGene.Reference

# load the gencode dataset, it will download a file from gencode website if it's not downloaded before
# once it's loaded, it will be cached so future loads will be fast
index = gencode_load("GRCh38")

genes = gencode_genes(index, "TP53")
tp53 = genes[1]
exons = tp53.transcripts[1].exons
# print the exons, one per line (tab-separated: number, start, end)
for exon in exons
    println(exon.number, "\t", exon.start_pos, "\t", exon.end_pos)
end
modified 22 months ago • written 22 months ago by chen1.9k