2.8 years ago by
San Francisco, CA
Unfortunately, GFF3 still hasn't been added to NCBI's E-utilities as a valid return type, despite having been added to the web tool a year or more ago. That said, we can take advantage of the web-based GFF retrieval tool directly – after inspecting network traffic while pulling GFFs from the NCBI web portal and playing around with the parameters, I was able to reverse engineer how to retrieve a GFF file given an accession number. The results can be retrieved using your favorite file retrieval tool (
cURL, etc.). Here's how I do it using
wget -O /path/to/your.gff "https://www.ncbi.nlm.nih.gov/sviewer/viewer.cgi?db=nuccore&report=gff3&id=<acc[.ver]>"
<acc.ver> in the example query string above should be replaced with your
N.B.: It's relatively straightforward to pull multiple GFFs from separate entries using a comma-separated list of identifiers, but I haven't stress tested this, nor have I slammed NCBI with so many queries that NCBI would feel compelled to block this type of web request. Here's a multi-identifier example:
wget -O Human_picobirnavirus.gff "https://www.ncbi.nlm.nih.gov/sviewer/viewer.cgi?db=nuccore&report=gff3&id=NC_007026.1,NC_007027.1"