I have a list of hundreds or thousands of RefSeq accessions, e.g. NC_000964.1.
If I point my browser to https://www.ncbi.nlm.nih.gov/nuccore/NC_000964.1, I see the gbff file displayed and I can download it through various clicks. What I can't figure out is how to automate this: how do I construct a URL or FTP address that I can simply pass to wget or rsync? Any advice?
Edit: the specific file I'm after is the one called GenBank (full) in the drop-down.
NCBI Datasets is designed specifically for something like this! You can query the bacterial genomes by taxids if you would like and download all of the data in one package. There is a command line tool that you can use on Windows, Mac or Linux machines to download sequence and annotation data starting from either taxids or accessions.
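As a rough sketch of the accession route: this assumes a recent release of the datasets command line tool in which the --inputfile, --include and --filename flags are available (flag names have changed between releases, so check datasets download genome accession --help first). The file name accessions.txt and both helper functions are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: keep only assembly accessions (GCF_/GCA_) from a
# mixed list, dropping blank lines, other accession types and duplicates.
filter_assembly_accessions() {
    grep -E '^GC[AF]_[0-9]+(\.[0-9]+)?$' "$1" | sort -u
}

# Hypothetical helper: download GBFF and GFF3 files for every assembly
# accession in a list file into a single zip package.
# Assumes the flags exist in the installed datasets version.
download_assemblies() {
    local list="$1"
    local out="${2:-ncbi_dataset.zip}"
    datasets download genome accession \
        --inputfile "$list" \
        --include gbff,gff3 \
        --filename "$out"
}
```

Usage would look something like filter_assembly_accessions accessions.txt > gcf_only.txt followed by download_assemblies gcf_only.txt bacteria.zip. Note that this only accepts assembly accessions (GCF_*/GCA_*), not nucleotide accessions such as NC_000964.1.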
At this time, NCBI Datasets includes only the latest assemblies. If you have a list of NCBI assembly accessions that are outside the scope of NCBI Datasets, you can download directly from the FTP paths. The NCBI Genomes FTP has a bunch of assembly_summary.txt files that list the full FTP path for each assembly; these paths can be used with a tool like lftp to download GFF3 and GBFF files.
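A minimal sketch of the FTP route, under two assumptions: that assembly_summary.txt is tab-separated with the assembly accession in column 1 and the FTP path in column 20, and that each assembly directory contains a file named after the directory with the suffix _genomic.gbff.gz. Both function names are hypothetical, and wget is swapped in for lftp here only because the question asked about wget.

```shell
#!/usr/bin/env bash

# Turn an assembly directory path into the URL of its GBFF file.
# Assumes the file is named <directory name>_genomic.gbff.gz.
gbff_url() {
    local dir="${1%/}"    # drop a trailing slash, if any
    printf '%s/%s_genomic.gbff.gz\n' "$dir" "${dir##*/}"
}

# Print GBFF URLs for the assembly accessions listed in $2 (one per line),
# looked up in the assembly summary file $1.
# Assumes column 1 = assembly accession, column 20 = FTP path.
gbff_urls_for() {
    awk -F '\t' '
        NR == FNR { want[$1]; next }                  # first file: wanted accessions
        !/^#/ && ($1 in want) && $20 != "na" { print $20 }
    ' "$2" "$1" |
    while IFS= read -r path; do
        gbff_url "$path"
    done
}
```

The resulting list can be handed straight to wget, e.g. gbff_urls_for assembly_summary.txt accessions.txt | wget -i -. Swapping the suffix for _genomic.gff.gz should give the GFF3 files in the same way.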
Technically, you can use Entrez Direct to download the GenBank flatfiles for a given set of nucleotide accessions like NC_000913, but it does not scale very well if you have many thousands of accessions. Note that the E-utilities API limits the number of requests you can make; getting an API key raises that limit.
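For a modest number of accessions, here is a sketch of what the E-utilities route could look like with plain wget. It assumes that the efetch rettype gbwithparts is what corresponds to the "GenBank (full)" view in the browser; the function names, the .gbk output naming and the NCBI_API_KEY variable are hypothetical. The one-second pause is a conservative choice to stay below the documented limit of about 3 requests per second without a key.

```shell
#!/usr/bin/env bash

# Build an efetch URL for one nuccore accession.
# $2 is an optional API key; it is only appended when non-empty.
efetch_url() {
    local url="https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
    url="$url?db=nuccore&id=$1&rettype=gbwithparts&retmode=text"
    if [ -n "${2:-}" ]; then
        url="$url&api_key=$2"
    fi
    printf '%s\n' "$url"
}

# Fetch one GenBank flatfile per accession listed in file $1.
fetch_gbff_records() {
    local acc
    while IFS= read -r acc; do
        [ -z "$acc" ] && continue
        wget -q -O "$acc.gbk" "$(efetch_url "$acc" "${NCBI_API_KEY:-}")"
        sleep 1    # be polite to the E-utilities servers
    done < "$1"
}

# Roughly equivalent Entrez Direct call for a single record:
#   efetch -db nuccore -id NC_000913 -format gbwithparts > NC_000913.gbk
```

With a file of accessions, fetch_gbff_records accessions.txt would write NC_000964.1.gbk and so on into the current directory. For many thousands of records the bulk routes above are likely to be faster.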
I was ultimately successful in combining @vkkodali's suggestion to use the NCBI Datasets tool with @GenoMax's suggestion to use the Entrez Direct tools, automating the step from an NC_*-style accession to a GCF_* accession and then downloading FASTA and GFF. A big thanks to both from me.
For an accession such as NC_000913.3, I ran esearch -db nuccore -query NC_000913 | elink -target assembly | esummary | xtract -pattern DocumentSummary -element RefSeq (note that the version suffix after the dot is dropped), which yielded e.g. GCF_000005845.2. Then datasets download genome accession GCF_000005845.2 downloaded a file, ncbi_dataset.zip, containing what I needed. This didn't always work, but out of about 3400 accessions it worked the large majority of the time, which was all I needed.
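Wrapped into a loop, the two steps above might look roughly like this. It is a sketch rather than the exact script I ran: the function names and the per-assembly zip naming are hypothetical, and --filename is assumed to be available in the installed datasets version (without it, every download would overwrite ncbi_dataset.zip). The list is read on file descriptor 3 so that the Entrez Direct tools cannot swallow it through standard input.

```shell
#!/usr/bin/env bash

# Drop the version suffix: NC_000913.3 -> NC_000913
strip_version() {
    printf '%s\n' "${1%%.*}"
}

# Map one nuccore accession to its RefSeq assembly accession (GCF_*),
# using the same Entrez Direct pipeline as described above.
nc_to_gcf() {
    esearch -db nuccore -query "$(strip_version "$1")" |
        elink -target assembly |
        esummary |
        xtract -pattern DocumentSummary -element RefSeq
}

# For every accession in file $1, look up the assembly and download it.
download_for_nuccore_list() {
    local acc gcf
    while IFS= read -r acc <&3; do
        [ -z "$acc" ] && continue
        gcf=$(nc_to_gcf "$acc" | head -n 1)
        if [ -z "$gcf" ]; then
            # The lookup does not always succeed; report and move on.
            echo "no assembly found for $acc" >&2
            continue
        fi
        datasets download genome accession "$gcf" --filename "$gcf.zip"
    done 3< "$1"
}
```

Calling download_for_nuccore_list nc_accessions.txt then leaves one zip per assembly in the current directory, and the accessions that failed are listed on stderr for a second pass.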
There might have been more direct ways. For a given NC_* accession, it seems like pointing a browser to e.g. https://www.ncbi.nlm.nih.gov/nuccore/NC_000913.3/ works, so I considered trying to automate clicking through "Send to", maybe with something like AutoHotKey, but ultimately decided what I had here was good enough.