Question: How Can I Get Sequences From A Gff3 File?
8.7 years ago, David B wrote:

As you know, GFF3 files can contain FASTA sequences after the feature table.

How do I extract a specific FASTA sequence given its ID? This is what I tried:

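use strict;
use warnings;
use Data::Dumper;
use Bio::Tools::GFF;

my $gffio = Bio::Tools::GFF->new(
    -file        => "/path/to/file.gff",
    -gff_version => 3,
);

# Dump whatever sequences the parser has collected so far.
print Dumper $gffio->get_seqs();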

but it seems to return nothing, although the GFF3 file contains sequences and is valid. I am able to parse the features themselves (using $gffio->next_feature()), but not the sequences (using $gffio->get_seqs()).

gff sequence bioperl • 3.2k views
8.7 years ago, brentp (Salt Lake City, UT) wrote:

For some (undocumented) reason, the BioPerl GFF parser wants you to iterate through all the features before getting the sequences. This limitation is probably there because the sequences appear at the end of the file, but as far as I could see that is not explained anywhere. I was able to get your example to work by adding:

while($gffio->next_feature()){ }

to the line before the print.
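
Putting the pieces together for the original goal (one sequence, looked up by ID), here is a minimal sketch rather than a definitive recipe. The helper name fetch_gff3_sequence is hypothetical and not part of BioPerl, and it assumes the FASTA header ID ends up as the display_id of each Bio::Seq object returned by get_seqs(). The path and ID are taken from the command line.

```perl
use strict;
use warnings;
use Bio::Tools::GFF;

# Hypothetical helper (not part of BioPerl): return the Bio::Seq object whose
# display ID equals $wanted_id, or nothing if the file has no such sequence.
sub fetch_gff3_sequence {
    my ( $path, $wanted_id ) = @_;

    my $gffio = Bio::Tools::GFF->new(
        -file        => $path,
        -gff_version => 3,
    );

    # Drain the feature table first: the sequences sit after it in the file,
    # and get_seqs() came back empty until the features had been read.
    while ( $gffio->next_feature() ) { }

    # Assumption: the FASTA header ID is exposed as display_id.
    for my $seq ( $gffio->get_seqs() ) {
        return $seq if $seq->display_id eq $wanted_id;
    }
    return;
}

# Usage: perl script.pl /path/to/file.gff SOME_ID
my ( $path, $id ) = @ARGV;
if ( defined $path && defined $id ) {
    my $seq = fetch_gff3_sequence( $path, $id );
    if ($seq) {
        print '>', $seq->display_id, "\n", $seq->seq, "\n";
    }
    else {
        warn "No sequence with ID '$id' found in $path\n";
    }
}
```

If the features are not needed at all, the empty loop is still required here, since it is what moves the parser past the feature table to the sequence section.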


Thanks, this seems to work, but the other option for getting sequences, $gffio->seq_id_by_h() (documented here: http://search.cpan.org/~sendu/bioperl/Bio/Tools/GFF.pm#GFF3_AND_SEQUENCE_DATA), does not. Very strange.

Comment written 8.7 years ago by David B