I am trying to BLAST a file containing about 42k FASTA sequences against a local BLAST database (nt), and I would like to restrict the search space. I read that a common way to do this is to restrict the search with a GI list (see the command line below).
My question is: how would you obtain a list of GIs strictly for bacteriophage-related nucleotide sequences? What I have done before is go to the NCBI Nucleotide database, search for "bacteriophage", and export the list of results to a GI file. But I am not sure this is the right way to do it, as I also get other results (other microbes).
Or am I going about this wrong?
Thanks for your help,
$ blastn -db nt -gilist list.gi -query seq.fasta -out blast_results.txt
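In case the exported file itself is part of the problem, here is a minimal sketch for tidying it before passing it to -gilist. It assumes the export is plain text with one numeric GI per line; clean_gilist and the file names are hypothetical and only for illustration. It does not remove non-phage hits. It only drops blank or non-numeric lines, Windows line endings, and duplicates.

```shell
# Hypothetical helper: keep only purely numeric lines (GIs), deduplicated.
# $1 = raw exported list, $2 = cleaned list to pass to -gilist
clean_gilist() {
    # Strip spaces, tabs and carriage returns, keep numeric-only lines,
    # then sort numerically and drop duplicates.
    tr -d ' \t\r' < "$1" | grep -E '^[0-9]+$' | sort -n -u > "$2"
}

# Example usage (file names are placeholders):
#   clean_gilist exported_results.txt list.gi
#   blastn -db nt -gilist list.gi -query seq.fasta -out blast_results.txt
```

Running wc -l on the raw and cleaned files shows how many lines were dropped.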