I am looking for a comprehensive list of gene positions on GI sequences. I have found gene2accession, but this file seems to contain start and stop positions only for some genes.
To give a concrete example, gene 5908888 (“Fphi_0526 proline:Na+ symporter [ Francisella philomiragia subsp. philomiragia ATCC 25017 ]”) appears in both CP000937.1 and NC_010336.1 (according to gene2accession, and this is consistent with the corresponding GenBank entries). However, gene2accession lists coordinates only for NC_010336.1, although the GenBank entry for CP000937.1 specifies coordinates too (interestingly, not under the gene ID but under the correct locus_tag, and the positions are identical).
Do I have to download the whole of GenBank to get the positions for every accession?
CP000937 is the GenBank entry and NC_010336 is the RefSeq entry; GenBank and RefSeq are not the same database. You can read more here: http://www.ncbi.nlm.nih.gov/books/NBK50679/#RefSeqFAQ.what_is_the_difference_betwe_2
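If it helps to see which accessions actually carry coordinates for a given gene in gene2accession, here is a minimal sketch in Python. The column positions are an assumption based on the usual tab-separated layout of that file (GeneID in the second column, genomic accession in the eighth, start/end/orientation in the tenth to twelfth, with "-" for missing values), so check them against the "#" header line of your copy. The function name coordinates_for_gene is made up for this example.

```python
import csv

# Assumed 0-based column positions in the tab-separated gene2accession file.
# Verify against the '#' header line of your download before relying on them.
GENE_ID = 1
GENOMIC_ACC = 7
START = 9
END = 10
ORIENTATION = 11


def coordinates_for_gene(lines, gene_id):
    """Yield (genomic_accession, start, end, orientation) for one GeneID.

    start and end are None where the file has '-' (no coordinates listed).
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines, the header line, and rows that are too short.
        if not row or row[0].startswith("#") or len(row) <= ORIENTATION:
            continue
        if row[GENE_ID] != str(gene_id):
            continue
        start = None if row[START] == "-" else int(row[START])
        end = None if row[END] == "-" else int(row[END])
        yield row[GENOMIC_ACC], start, end, row[ORIENTATION]


# Usage (assumes gene2accession.gz is in the current directory):
#
#   import gzip
#   with gzip.open("gene2accession.gz", "rt") as handle:
#       for hit in coordinates_for_gene(handle, 5908888):
#           print(hit)
```

Rows that come back with None for start and end are the ones where gene2accession has no coordinates for that accession; for those you would still need the feature table of the corresponding GenBank record.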