Question: Bioperl gff3 : How to print non gff3 feature as "###" using Bio::Tools::GFF
0
gravatar for Juke-34
3.4 years ago by
Juke-343.1k
Sweden
Juke-343.1k wrote:

Hi everybody,

It is not the first time I meet the issue. GFF3 format allows to have lines that are not features like ###. But using Bio::Tools::GFF I don't find any method allowing to print that kind of non-feature lines.

Does someone know a way to perform that task ? I'm only interested in Perl solution.

Thank you !

genome gene • 1.2k views
ADD COMMENTlink modified 3.4 years ago by Lee Katz3.0k • written 3.4 years ago by Juke-343.1k

Could you write an example of what you'd like to see in the GFF3 file?

ADD REPLYlink written 3.4 years ago by Lee Katz3.0k

Something like:

ctg123 . gene               1000  9000  .  +  .  ID=gene00001;Name=EDEN
ctg123 . mRNA               1050  9000  .  +  .  ID=mRNA00001;Parent=gene00001;Name=EDEN.1
ctg123 . CDS                1201  1500  .  +  0  ID=cds00001;Parent=mRNA00001
ctg123 . CDS                3000  3902  .  +  0  ID=cds00001;Parent=mRNA00001
ctg123 . CDS                5000  5500  .  +  0  ID=cds00001;Parent=mRNA00001
ctg123 . CDS                7000  7600  .  +  0  ID=cds00001;Parent=mRNA00001
###
ctg123 . gene               11000  91000  .  +  .  ID=gene00002
ctg123 . mRNA               11050  91000  .  +  .  ID=mRNA00002;Parent=gene00002
ctg123 . CDS                11201  11500  .  +  0  ID=cds00002;Parent=mRNA00002
ctg123 . CDS                31000  31902  .  +  0  ID=cds00002;Parent=mRNA00002
ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by Juke-343.1k

It seems Bio::Tools::GFF can only output features. The solution is to output the non-feature lines yourself. If you want to use forward reference closing (###), you could output relevant features with Bio::Tools::GFF then output ### then output the next batch of features and so on. If that's not convenient, you could output all features then insert the non-feature lines. You would need to keep track of where they should go in the file. Also you could do this in memory before actually printing everything to file.

ADD REPLYlink written 3.4 years ago by Jean-Karim Heriche21k

I wanted to print directly in a file, but to put everything in memory before to print it seems to be the only convenient solution. I'm a bit worry because my files are over 2gb ... so I hope my computer can digest it easily.

ADD REPLYlink written 3.4 years ago by Juke-343.1k
2
gravatar for Lee Katz
3.4 years ago by
Lee Katz3.0k
Atlanta, GA
Lee Katz3.0k wrote:

Here is the code that actually works after I did some digging. It's a little bit of a hack but I don't believe it will cause any problems. Also I changed it to Bio::FeatureIO which I think is a little more mainstream. In this one-liner, ### is printed to the output file after each input feature, but you can change it to however you want.

perl -MBio::FeatureIO -e '
  $in=Bio::FeatureIO->new(-file=>"in.gff"); 
  $out=Bio::FeatureIO->new(-file=>">tmp.gff"); 
  $fh=$out->{"_filehandle"}; 
  while($feat=$in->next_feature){
    $out->write_feature($feat); 
    print $fh "###\n";
  }
'

The method fh() actually performs some tie and Symbol::gensym magic which I found in the source code. This magic forces you to use a feature object when you write to the file.

sub fh {
  my $self = shift;
  my $class = ref($self) || $self;
  my $s = Symbol::gensym;
  tie $$s,$class,$self;
  return $s;
}

To get around that, I found that the filehandle is actually embedded in the object's variables, under _filehandle. Therefore you can exploit $out->{"_filehandle"} by printing to it directly.

ADD COMMENTlink modified 3.4 years ago • written 3.4 years ago by Lee Katz3.0k

It's exactly what I was looking for !! Thank you so much !

ADD REPLYlink written 3.4 years ago by Juke-343.1k
0
gravatar for Lee Katz
3.4 years ago by
Lee Katz3.0k
Atlanta, GA
Lee Katz3.0k wrote:

Jean-Karim is correct that the GFF module will probably only output features. However, you probably can manually write to the file in your perl script! There is a method called $obj->fh which returns the filehandle of your GFF object. Therefore you could write to the file as desired. For example:

$gffObj->write_feature($something);
# Get the filehandle
my $fh=$gffObj->fh;
# Print comments to the file
print $fh "###\n";
$gffObj->write_feature($something2);
$gffObj->close
ADD COMMENTlink written 3.4 years ago by Lee Katz3.0k

It would be nice, but it doesn't work. It as well done to print a feature object. The error is : Can't locate object method "strand" via package "### ...

I'm afraid that the only solution will be to work on bioperl to fix it.

ADD REPLYlink written 3.4 years ago by Juke-343.1k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 750 users visited in the last hour