Question: Local blast limit query search by GI list?
4.5 years ago by
milt0n0 wrote:

Hi All,  

I am trying to blast a file that contains about 42k fasta sequences against a local blast database (nt), and I would like to restrict the search space. I read that a common way to do that is to restrict the search using "gi" (see command line below).

My question is: how would you go about obtaining a list of GIs strictly for bacteriophage-related nucleotide sequences? What I have done before is go to the NCBI nucleotide database, search for "bacteriophage", and export the list of results to a GI file. But I am not sure this is the way to do it, as I also get other results (other microbes).

Or am I going about this wrong?

Thanks for your help,


$ blastn -db nt -gilist <gi_list_file> -query seq.fasta -out blast_results.txt
blast latest bacteriophage • 2.3k views
4.5 years ago by
United States
genomax87k wrote:

That is the right way to do this. Getting all viral genomes and parsing out the bacteriophage GIs may be the preferred option. I see 1700+ entries for phages.

$ grep "phage" viral.1.1.genomic.fna | awk -F "|" '{print $2}' > phage_gi_list

should do it.
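
A slightly stricter variant of the one-liner above, as a rough sketch: it only looks at FASTA header lines (those starting with ">"), matches "phage" regardless of case, and de-duplicates the GIs. The file sample_viral.fna and its headers are made up purely for illustration, and the legacy "gi|NUMBER|..." header layout is assumed; headers without a GI field would need a different approach.

```shell
# Toy input with made-up headers in the legacy "gi|NUMBER|db|accession|" layout.
cat > sample_viral.fna <<'EOF'
>gi|111|ref|XX_000001.1| Enterobacteria phage example, complete genome
ACGTACGT
>gi|222|ref|XX_000002.1| Example virus, complete genome
TTGGCCAA
>gi|333|ref|XX_000003.1| Bacillus Phage example, complete genome
GGGTTTAA
EOF

# Keep header lines that mention "phage" (any case), print the GI
# (second "|"-separated field), and drop duplicates.
awk -F '|' '/^>/ && tolower($0) ~ /phage/ {print $2}' sample_viral.fna \
  | sort -u > phage_gi_list

cat phage_gi_list
```

The resulting phage_gi_list can then be given to blastn as the argument of -gilist. It is worth skimming the matched headers first, since a plain text match on "phage" can still pick up entries you do not want.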

You could try the taxonomy ID route to get a more restricted set of GIs. I am not sure whether that option gives you all bacteriophages, though.


Thanks, the first link is indeed too restrictive, but I see what you mean. I'll explore a bit further.


written 4.5 years ago by milt0n

Go with the viral genomes option. I will move it up in the post above.

written 4.5 years ago by genomax