Question: Getting Genome Coordinates From RefSeq Exon mRNA Position Data?
9.9 years ago by Krisr (United States):

I am using BioPerl to obtain exon coordinates for a variety of mRNAs. For example:

use strict;
use warnings;
use Bio::DB::GenBank;

my @exons;
my $gb  = Bio::DB::GenBank->new;
my $seq = $gb->get_Seq_by_acc('NM_005378');

# Collect the location of every exon feature (positions are on the mRNA)
for my $feat ($seq->get_SeqFeatures) {
  if ($feat->primary_tag eq 'exon') {
    push @exons, $feat->location;
  }
}

# Dump the exon start and end positions
printf "%d\t%d\n", $_->start, $_->end for @exons;

I would now like to use BioPerl to obtain the corresponding genomic DNA positions on the reference assembly. I am ONLY interested in the gDNA positions of each reported exon. Does anyone know of a function that could provide this?

9.9 years ago by Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087):

Not with BioPerl, just MySQL. UCSC has already mapped the RefSeq genes onto the genome:

> mysql  -h -A -u genome -D hg18 -e 'select * from refGene where name="NM_005378"\G'
*************************** 1. row ***************************
         bin: 707
        name: NM_005378
       chrom: chr2
      strand: +
     txStart: 15998133
       txEnd: 16004580
    cdsStart: 15999637
      cdsEnd: 16003670
   exonCount: 3
  exonStarts: 15998133,15999520,16003065,
    exonEnds: 15998316,16000427,16004580,
          id: 0
       name2: MYCN
cdsStartStat: cmpl
  cdsEndStat: cmpl
  exonFrames: -1,0,1,
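
The exonStarts and exonEnds columns in that row can be turned into per-exon coordinates with a few lines of Perl. The following is only a sketch: `exon_table` is a hypothetical helper name, the row values are copied by hand from the output above, and the mRNA ranges assume a gapless alignment between the transcript and the genome, so they may not match the exon features of the GenBank record exactly. UCSC stores starts as 0-based and ends as 1-based, which is why 1 is added to each start.

```perl
use strict;
use warnings;

# Values copied from the refGene row for NM_005378 (hg18) shown above.
my %row = (
    chrom      => 'chr2',
    strand     => '+',
    exonStarts => '15998133,15999520,16003065,',
    exonEnds   => '15998316,16000427,16004580,',
);

# Hypothetical helper: returns one hash per exon, in transcript order, with
# 1-based inclusive genomic (g_*) and mRNA (m_*) coordinates.
sub exon_table {
    my ($row) = @_;
    my @starts = grep { length } split /,/, $row->{exonStarts};
    my @ends   = grep { length } split /,/, $row->{exonEnds};
    die "exonStarts/exonEnds length mismatch\n" unless @starts == @ends;

    # UCSC starts are 0-based; add 1 to get 1-based inclusive coordinates.
    my @exons = map { +{ g_start => $starts[$_] + 1, g_end => $ends[$_] } }
                0 .. $#starts;

    # On the minus strand the first exon of the transcript is the last one
    # on the chromosome, so walk the exons in reverse genomic order.
    @exons = reverse @exons if $row->{strand} eq '-';

    # Assumption: no gaps in the alignment, so mRNA positions simply
    # accumulate exon lengths.
    my $offset = 0;
    for my $e (@exons) {
        my $len = $e->{g_end} - $e->{g_start} + 1;
        $e->{m_start} = $offset + 1;
        $e->{m_end}   = $offset + $len;
        $offset      += $len;
    }
    return @exons;
}

for my $e (exon_table(\%row)) {
    printf "%s\t%d\t%d\t%s\tmRNA:%d-%d\n",
        $row{chrom}, @$e{qw(g_start g_end)}, $row{strand}, @$e{qw(m_start m_end)};
}
```

Each printed line gives the chromosome, the genomic start and end of one exon, the strand, and the stretch of the mRNA that the exon covers under the gapless assumption.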

The table is available for download at:


Thanks Pierre, this really helped me.

Reply written 9.1 years ago by A.L
9.4 years ago by Reece (United States):

I needed something similar. The only way I found was to use the NCBI E-utilities: search nuccore by NM accession to get an id, then use that id to fetch a "full" record from nuccore as XML. I had to reverse engineer the XML format.

The code is here.

And it works something like this:

apt12j$ ~/projects/bio-hgvs-perl/sandbox/ncbi-tx-exons NM_023035.2
NCBI (NM_023035.2; 1 transcripts)

This script was just a sketch to see how to do it. Perhaps it'll help you get started.

Also see for a discussion on this topic.
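
For readers who want to try the two-step lookup described above, here is a minimal sketch of the requests involved. It is not the script referred to in this answer: the helper names (`esearch_url`, `efetch_url`, `first_id`) are made up for illustration, the exact parameters the original script used are not known, and the exon information still has to be dug out of the returned XML. The network is only touched when an accession is passed on the command line.

```perl
use strict;
use warnings;
use HTTP::Tiny;   # core module; HTTPS requests also need IO::Socket::SSL

my $EUTILS = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils';

# Step 1: URL that searches nuccore for an accession (hypothetical helper).
sub esearch_url {
    my ($acc) = @_;
    return "$EUTILS/esearch.fcgi?db=nuccore&term=$acc";
}

# Step 2: URL that fetches the record for a nuccore id as XML
# (hypothetical helper).
sub efetch_url {
    my ($id) = @_;
    return "$EUTILS/efetch.fcgi?db=nuccore&id=$id&retmode=xml";
}

# Pull the first <Id> element out of an esearch response. A regex is enough
# for a sketch; a real script would use an XML parser.
sub first_id {
    my ($xml) = @_;
    return $xml =~ m{<Id>(\d+)</Id>} ? $1 : undef;
}

# Only query NCBI when an accession is given, e.g. NM_023035.2
if (my $acc = shift @ARGV) {
    my $http = HTTP::Tiny->new;

    my $res = $http->get(esearch_url($acc));
    die "esearch failed: $res->{status}\n" unless $res->{success};

    my $id = first_id($res->{content});
    die "no nuccore id found for $acc\n" unless defined $id;

    $res = $http->get(efetch_url($id));
    die "efetch failed: $res->{status}\n" unless $res->{success};

    print $res->{content};   # raw XML; exon parsing is left to the reader
}
```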



Please ask a new question.

Reply written 9.4 years ago by Pierre Lindenbaum