I want to download every available protein sequence for a set of organisms and all of their strains. PATRIC appears to offer more strains than NCBI lists (as far as I can tell; feel free to correct that, and please indicate how I can find the corresponding sequences on NCBI), but I can't see how to download them other than manually. Bacillus anthracis alone has 65 complete sequences, each with more than 5000 protein families, so downloading them by hand and copying and pasting isn't a scalable solution.
NCBI has Entrez. Does anyone know of a similar facility for PATRIC?
Link to PATRIC database for set of protein families belonging to one strain:
https://www.patricbrc.org/view/Genome/743835.4#view_tab=proteinFamilies
That table is downloadable: select all rows via the check mark at the top left corner, then select "Download" (right corner) to save the table as text/CSV.
That downloads the table, but I want the FASTA sequences. It is also not a command-line option, so it would not scale to an arbitrary number of organisms and their strains.
Some of the data that you see in the web front end may be derived from primary data, and there may be no way to download it automatically (you could try writing to the site owners to see if they can export some of it from the backend for you). All primary sequence data appears to be available via FTP at the link below. There are tens of thousands of genomes, so you may have to be patient while downloading, since that FTP site does not appear to be very fast.
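If the FTP route works for you, the per-genome download could be scripted roughly as below. This is only a sketch under assumptions that are not stated in this thread: that the FTP host is `ftp.patricbrc.org`, and that each genome has a directory named after its genome ID containing a protein FASTA file called `<genome_id>.PATRIC.faa`. Check the actual directory listing before relying on it. The names `faa_url` and `download_proteins` are made up for this example.

```python
from pathlib import Path
from urllib.request import urlretrieve

# ASSUMPTION: host and directory layout are not confirmed in this thread;
# verify them against the real FTP listing and adjust if they differ.
BASE_URL = "ftp://ftp.patricbrc.org/genomes"


def faa_url(genome_id, base_url=BASE_URL):
    """Build the (assumed) URL of the protein FASTA file for one genome ID."""
    return f"{base_url}/{genome_id}/{genome_id}.PATRIC.faa"


def download_proteins(genome_ids, out_dir="patric_faa", base_url=BASE_URL):
    """Download one protein FASTA file per genome ID into out_dir.

    Files that already exist are skipped, so an interrupted run over a slow
    server can simply be restarted.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for genome_id in genome_ids:
        target = out / f"{genome_id}.PATRIC.faa"
        if not target.exists():
            # Write to a temporary name first so a failed transfer does not
            # leave a truncated file that a later run would skip.
            part = target.with_name(target.name + ".part")
            urlretrieve(faa_url(genome_id, base_url), str(part))
            part.replace(target)
        saved.append(target)
    return saved
```

For the strain linked above, calling `download_proteins(["743835.4"])` would try to fetch that genome's protein file. The list of genome IDs for all strains of an organism would have to come from somewhere else, for example from a table exported from the web front end.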