I am working on a comparison of current metagenomics tools, and I am having trouble finding a good, complete and up-to-date reference database. My ideal would be a selection of bacterial genomes from NCBI RefSeq with representatives of each species, covering strains with high phylogenetic diversity, as proposed in GEBA. It would also help if the database were easy to download, since I don't find NCBI very user-friendly: it is hard to select the genomes of interest, and downloading file by file over FTP takes ages (or perhaps I am simply not doing it properly). The best option I have found so far is HMP, but I would prefer a complete bacterial database. Another option would be SILVA, but I would like to compare performance on whole genomes rather than on 16S only.
Do you know of any free databases with these characteristics? What do people use as reference databases for metagenomics? Thanks in advance for any suggestions.