How To Retrieve mRNA Split Locations From GenBank Flatfile?
9.0 years ago
mluypaert ▴ 10

Hi all,

I am having trouble parsing the GenBank flatfile format that NCBI uses for data export. I have a GenBank flatfile containing genomic regions with mRNA features in it, which I am parsing with Perl (and BioPerl). The mRNA features were retrieved with the get_SeqFeatures() function, and I can retrieve all information about each mRNA using the get_all_tags() and get_tag_values() functions from BioPerl, but I also need the genomic locations for each exon in the mRNA. For that I need to find the genomic location of the gene it belongs to (which doesn't seem to be in the flatfiles I downloaded), but more importantly, I need to be able to get the split locations for each exon in the mRNA from an mRNA line like:

 mRNA            complement(join(4468..4717,4801..4940,6511..6767,

How can I retrieve this bit of information (from the SeqFeature object I am using in BioPerl)?

genbank ncbi parsing perl bioperl
9.0 years ago
mluypaert ▴ 10

I found the answer myself after some browsing in the BioPerl manuals. The following chunk of Perl code solved my problem:

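        my $location_obj = $feat_object->location();

        # retrieve split location: one sub-location per exon
        my @sub_locations;
        my $location_ref = ref($location_obj);
        if ($location_ref eq 'Bio::Location::Simple') {
            # single-exon feature: the location itself is the only entry
            $sub_locations[0] = $location_obj;
        } elsif ($location_ref eq 'Bio::Location::Split') {
            # join(...) feature: each sub-location is one exon
            @sub_locations = $location_obj->sub_Location();
        }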
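
For anyone who wants the whole loop, here is a minimal sketch of how the snippet above might fit into a complete script. The helper names exon_locations and print_mrna_exons are made up for illustration, and the input file is assumed to be passed as the first command-line argument. One deliberate difference from my snippet: it checks isa('Bio::Location::SplitLocationI') rather than comparing the class name, so that any non-split location class (for example Bio::Location::Fuzzy) is simply treated as a single exon.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper: return one location object per exon.
# Split locations (join(...)) yield their sub-locations;
# anything else is treated as a single-exon location.
sub exon_locations {
    my ($loc) = @_;
    return $loc->isa('Bio::Location::SplitLocationI')
        ? $loc->sub_Location
        : ($loc);
}

# Hypothetical helper: print one tab-separated line per exon
# (sequence id, start, end, strand) for every mRNA feature in the file.
sub print_mrna_exons {
    my ($file) = @_;
    my $seqio = Bio::SeqIO->new(-file => $file, -format => 'genbank');

    while (my $seq = $seqio->next_seq) {
        for my $feat ($seq->get_SeqFeatures) {
            next unless $feat->primary_tag eq 'mRNA';

            for my $exon (exon_locations($feat->location)) {
                # Coordinates are relative to the record (the contig here),
                # not to the chromosome.
                printf "%s\t%s\t%s\t%s\n",
                    $seq->display_id,
                    $exon->start,
                    $exon->end,
                    $exon->strand // 0;
            }
        }
    }
}

# Run only when a GenBank file is given on the command line.
print_mrna_exons($ARGV[0]) if @ARGV;
```

Saved as, say, mrna_exons.pl, it would be run as "perl mrna_exons.pl myfile.gb". The strand column should be -1 for complement(...) features, but I would check the order in which the exons come back for minus-strand features against your own data before relying on it.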

For retrieving the genomic locations instead of the contig locations (which are what you get directly from the GenBank flatfiles in this case), I made a separate post: "Genomic Region For Ncbi Transcript(/Gene) Accessions".

