I followed the advice in this post to subset the nr database to a specific taxonomic group: Vertebrate Subset Nr Database? Build My Own?
However, although the GI list contains about 5 million GIs, the resulting database ended up with only about 2 million sequences. Is this working as intended? The nr database and the GI list were downloaded only a day apart, so I don't think 3 million sequences were added within that time.
>blastdb_aliastool -gilist virus.gi_list180712.txt -db nr -out nr_virus -title nr_virus
Converted 4764026 GIs from virus.gi_list180712.txt to binary format in nr_virus.p.gil
Created protein BLAST (alias) database nr_virus with 2239853 sequences
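One check I can run on my side is whether the GI list itself contains duplicate entries, since that would account for at least part of the gap. Below is a minimal sketch, assuming the list is plain text with one GI per line; the helper name `count_gis` is my own and not part of BLAST+. It only rules duplicates in or out and does not by itself explain the difference.

```python
from pathlib import Path


def count_gis(lines):
    """Return (total, unique) counts of non-empty GI entries.

    Assumes one GI per line; blank lines are ignored.
    """
    gis = [line.strip() for line in lines if line.strip()]
    return len(gis), len(set(gis))


if __name__ == "__main__":
    # File name taken from the command above.
    gi_file = Path("virus.gi_list180712.txt")
    if gi_file.exists():
        with gi_file.open() as handle:
            total, unique = count_gis(handle)
        print(f"{total} GIs listed, {unique} unique")

    # Gap between the two numbers reported by blastdb_aliastool.
    converted_gis = 4764026
    db_sequences = 2239853
    print(f"GIs without their own sequence entry: {converted_gis - db_sequences}")
```

With the numbers from the output above, the gap is 2524173, so closer to 2.5 million than 3 million.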