3.4 years ago by
Atlanta, GA
Here is the code that actually works, after I did some digging. It's a bit of a hack, but I don't believe it will cause any problems. I also switched to Bio::FeatureIO, which I think is a little more mainstream. In this one-liner, ### is printed to the output file after each input feature, but you can change that to whatever you want.
perl -MBio::FeatureIO -e '
$in=Bio::FeatureIO->new(-file=>"in.gff");
$out=Bio::FeatureIO->new(-file=>">tmp.gff");
$fh=$out->{"_filehandle"};
while($feat=$in->next_feature){
$out->write_feature($feat);
print $fh "###\n";
}
'
The method fh() actually performs some tie and Symbol::gensym magic, which I found in the source code. This magic forces you to use a feature object when you write to the file.
sub fh {
my $self = shift;
my $class = ref($self) || $self;
my $s = Symbol::gensym;
tie $$s,$class,$self;
return $s;
}
To get around that, I found that the filehandle is actually embedded in the object's variables, under _filehandle. Therefore you can exploit $out->{"_filehandle"} by printing to it directly.
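To make the mechanism easier to see without BioPerl installed, here is a minimal sketch using only core Perl. ToyIO is a hypothetical stand-in class I made up for illustration, not a BioPerl class, and its PRINT method only imitates the "features only" behaviour described above. The tie is written as `tie *$s`, the documented typeglob form.

```perl
use strict;
use warnings;
use Symbol ();

package ToyIO;    # hypothetical stand-in, not a BioPerl class

sub new {
    my ($class, $raw_fh) = @_;
    return bless { _filehandle => $raw_fh }, $class;
}

# Same shape as the fh() shown above: hand back a glob tied to this object.
sub fh {
    my $self  = shift;
    my $class = ref($self) || $self;
    my $s     = Symbol::gensym;
    tie *$s, $class, $self;
    return $s;
}

# tie calls this; the existing object becomes the tied object.
sub TIEHANDLE {
    my ($class, $obj) = @_;
    return $obj;
}

# Every print on the tied glob lands here, and only hash refs are accepted.
sub PRINT {
    my ($self, @items) = @_;
    for my $item (@items) {
        die "ToyIO only accepts feature hashes\n" unless ref $item eq 'HASH';
        print { $self->{_filehandle} } "feature:$item->{name}\n";
    }
    return 1;
}

package main;

my $buffer = '';
open my $raw, '>', \$buffer or die "cannot open in-memory handle: $!";

my $io   = ToyIO->new($raw);
my $tied = $io->fh;

my $feature = { name => 'gene1' };
print {$tied} $feature;                      # routed through PRINT

my $ok = eval { print {$tied} "###\n"; 1 };  # PRINT dies: a string is not a feature

print { $io->{_filehandle} } "###\n";        # raw handle, so PRINT is not involved
close $raw or die "cannot close in-memory handle: $!";

print $buffer;
```

The tied handle rejects the plain string, while the raw handle stored in the object accepts it, which is the same trick the one-liner relies on. Keep in mind that a key like _filehandle is internal, so it could change between releases.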
Written 3.4 years ago by Lee Katz (modified 3.4 years ago)
Could you write an example of what you'd like to see in the GFF3 file?
Something like:
It seems Bio::Tools::GFF can only output features. The solution is to output the non-feature lines yourself. If you want to use forward reference closing (###), you could output relevant features with Bio::Tools::GFF then output ### then output the next batch of features and so on. If that's not convenient, you could output all features then insert the non-feature lines. You would need to keep track of where they should go in the file. Also you could do this in memory before actually printing everything to file.
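As a rough sketch of the batch idea, the filter below works on plain GFF3 text rather than on Bio::Tools::GFF objects, so it never holds more than one line in memory. It rests on an assumption that may not hold for your files: every child feature (one with a Parent= attribute) comes after its top-level parent and before the next top-level feature. copy_with_batch_markers is a hypothetical helper name, not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: copy GFF3 lines from $in to $out and close each
# batch (a top-level feature plus its children) with a ### line.
# Assumes children directly follow their top-level parent.
sub copy_with_batch_markers {
    my ($in, $out) = @_;
    my $seen_feature = 0;
    while (my $line = <$in>) {
        chomp $line;
        next if $line =~ /^###\s*$/;           # drop old markers to avoid duplicates
        if ($line eq '' || $line =~ /^#/) {    # pass other comments and directives through
            print {$out} "$line\n";
            next;
        }
        my @cols  = split /\t/, $line;
        my $attrs = $cols[8] // '';
        my $is_top_level = $attrs !~ /(?:^|;)Parent=/;
        # A new top-level feature closes the previous batch.
        print {$out} "###\n" if $is_top_level && $seen_feature;
        print {$out} "$line\n";
        $seen_feature = 1;
    }
    print {$out} "###\n" if $seen_feature;     # close the final batch
    return;
}

# Small in-memory demonstration with made-up features.
my $gff = join '', map { join("\t", @$_) . "\n" } (
    [qw(chr1 demo gene 1 100 . + . ID=g1)],
    [qw(chr1 demo mRNA 1 100 . + . ID=m1;Parent=g1)],
    [qw(chr1 demo gene 200 300 . + . ID=g2)],
);

my $result = '';
open my $in,  '<', \$gff    or die "cannot open input: $!";
open my $out, '>', \$result or die "cannot open output: $!";
copy_with_batch_markers($in, $out);
close $in;
close $out or die "cannot close output: $!";

print $result;
```

For real files you would open in.gff and the output path instead of the in-memory strings. If your features are not grouped under their parents, this shortcut does not apply and you would need to track where the markers belong, as described above.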
I wanted to print directly to a file, but holding everything in memory before printing seems to be the only convenient solution. I'm a bit worried because my files are over 2 GB, so I hope my computer can handle it.