I am wondering about the best practice in Biopython to work with large GFF3 files (>100 MB). As minimum requirement, I want to be able to quickly and randomly retrieve features by ID, name, and coordinates, and returned feature objects (SeqRecords?) should have the hierarchical feature structure (e.g. gene -> mRNA -> CDS) correctly mapped onto them. Write access would be nice but is optional. External dependencies should be kept minimal to increase portability to other systems.
I am currently thinking along the line of importing the GFF3 file into a BioSQL database (How? Can I use a SQLite database?) and then access this database with Biopython's BioSQL interface. GFFutils looks useful, but I would prefer to work with Biopython objects (e.g. Bio.SeqRecord).
I know how to do all this in BioPerl bp_seqfeature_load.pl and Bio::DB::SeqFeature::Store), but I am new to Biopython.
Edit: I changed the title a bit to better reflect my requirements.