Parsing A Multi-Record Genbank File
1
0
Entering edit mode
10.5 years ago
robjohn7000 ▴ 110

Hi,

I tried parsing a multi-record genbank file (from this site: http://biopython.org/DIST/docs/tutorial/examples/ls_orchid.gbk) using the code below.

The code returned an error:

readline() on unopened filehandle at parser.pl line 62.

The code:

#!/usr/local/bin/perl -w

use strict;

my $record;

print "Please type in the name of a file\n";
my $file = <STDIN>;
chomp $file;


while( $record = get_next_record($file) ) {

       my ($annotation, $seq) = get_dna ($record);
       open my $fh, '>', 'oufile.txt', or die "cant't open outfile:$!";
       print "Sequence:\n\n", $seq, "\n";

 close $fh or die "cant't open outfile:$!";

 }

 sub get_dna {
      my ($file) = @_;
      my @annotation = ();
      my $seq = '';
      my $in_sequence = 0;
      open FILE, $file or die "Can't open $file, Perl says $!\n";
      while ( my $line = <FILE> )    {
          last if $line =~ m|^//\n|; # stop if line has just // on it
          $in_sequence = 1 if $line =~ m/^ORIGIN/;
               if ($in_sequence) {
                   $seq .= $line;
               } else {
             push @annotation, $line;
              }
       }
       close FILE;
# remove all spaces and digits from $seq. add \n to remove newlines
$seq =~ s/[\s0-9]//g;
# return @annotation array as a scalar reference and scalar $seq
return \@annotation, $seq;
}

sub get_next_record {

    my($file) = @_;
    my($offset);
    my($record) = '';
    my($save_input_separator) = $/;
    $/ = "//\n";
    $record = <$file>;
    $/ = $save_input_separator;
    return $record;
}

Is anyone able to to figure out why I got the error?

Thank you

bioperl biopython perl r python • 4.0k views
ADD COMMENT
3
Entering edit mode
10.5 years ago
Neilfws 49k

I get:

readline() on unopened filehandle at parser.pl line 51

Line 51 is:

$record = <$file>;

So you're trying to open $file, but $file isn't a filehandle.

Your code is rather unnecessarily convoluted and complex, so it's hard to tell what you want to achieve. I get output if I change the line to:

$record = $file;

Is that what you intended? Take a look at the Bioperl Seq::IO HOWTO and Feature Annotation HOWTO for example code.

ADD COMMENT
0
Entering edit mode

Thanks Neilfws. After modifying the code as you suggested($record = $file;), the program outputs the first record sequence and hangs.

ADD REPLY
0
Entering edit mode

Still not clear what you want to achieve. What do you want to see as output?

ADD REPLY
0
Entering edit mode

I want to see all the sequences from all the records in the multi-record genbank file. So far, I can only get the sequence from the first record, and then it hangs.

ADD REPLY
0
Entering edit mode

See the Bioperl links that I posted. Bioperl is your friend; no need to reinvent the wheel.

ADD REPLY
0
Entering edit mode

Thanks again Neilfws

ADD REPLY

Login before adding your answer.

Traffic: 1645 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6