Question: BLAST Database error: No GIs were found in BLAST database
11 months ago, anasofiamoreira94 wrote:

Hello, I'm trying to filter out some sequences from the NCBI nt database. This is how I went about it:

  • 1. Download the prebuilt nt database
  • 2. Search the Entrez Nucleotide database with the query: "taxid3708[ORGN]"
  • 3. Select "Send to File" and choose the format "GI list"
  • 4. Use the list of GIs from the previous step with: blastdb_aliastool -gilist sequence.txt -db nt_v5 -out nt_allergen -dbtype nucl

However, when I use this command line I get this error: BLAST Database error: No GIs were found in BLAST database

Here are some IDs from my GI list: 1376310040 1179788179 1464315148 1551319539 1534512279

Am I retrieving the GI list the right way? Thanks for the help.

blast next-gen • 501 views

If you are looking to restrict blast results to that taxID then why not use the new blast+ option (available with v.2.8.1):

 -taxids <String>
   Restrict search of database to include only the specified taxonomy IDs
   (multiple IDs delimited by ',')
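
As a rough illustration of the option above, here is a minimal sketch that assembles a blastn command line restricted to one taxonomy ID. The query file name (query.fasta), the output file name (hits.txt), and the helper function itself are hypothetical; the database name nt_v5 and taxonomy ID 3708 are taken from the question. It only builds the argument list and does not run BLAST.

```python
import shlex


def build_blastn_taxid_command(db, query, taxids, out):
    """Return the argument list for a blastn search limited to the given taxonomy IDs.

    The file names passed in are placeholders chosen for this example.
    """
    if not taxids:
        raise ValueError("at least one taxonomy ID is required")
    # -taxids expects the IDs as a single comma-delimited string
    taxid_arg = ",".join(str(int(t)) for t in taxids)
    return ["blastn", "-db", db, "-query", query, "-taxids", taxid_arg, "-out", out]


# Hypothetical file names; nt_v5 and 3708 come from the original question.
cmd = build_blastn_taxid_command("nt_v5", "query.fasta", [3708], "hits.txt")
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a machine where a suitable blast+ version and the database are installed.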
written 11 months ago by genomax

According to its description, nt is:

Partially non-redundant nucleotide sequences from all traditional divisions of GenBank, EMBL, and DDBJ excluding GSS, STS, PAT, EST, HTG, and WGS.

Your first GI corresponds to MF401153.1, which is identical to MF401152.1 except at position 146, where MF401153.1 has N and MF401152.1 has T. The latter is thus objectively of higher quality and is probably the one present in nt.

BTW, I thought NCBI had phased out GIs already.

modified 11 months ago • written 11 months ago by 5heikki

Yes, they have phased out GIs for most purposes, so using accession numbers is definitely the preferred way. I will quote the following from NCBI:

So are the GI numbers gone for good?

No! They are still part of the data record, and you will still be able to use them to retrieve the record on the web or using the E-utilities, indefinitely. They will remain in XML and ASN.1 data presentations, and will only be removed from flat files and FASTA.

However, more and more new sequence records will not be assigned a GI number, and so will never be retrievable using GI methods. But records that currently have a GI will always have that GI.
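
Since existing GIs can still be used with the E-utilities, one way to move from a GI list to accession numbers might look like the sketch below. It only builds an efetch request URL for the nuccore database with rettype=acc and does not contact NCBI. The helper name is hypothetical, and the GIs are the first two from the question.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def accession_lookup_url(gis):
    """Build an efetch URL asking for the accession.version of each GI.

    Hypothetical helper for illustration; it performs no network request.
    """
    params = {
        "db": "nuccore",
        # efetch accepts a comma-separated list of identifiers
        "id": ",".join(str(int(g)) for g in gis),
        "rettype": "acc",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


url = accession_lookup_url([1376310040, 1179788179])
print(url)
```

Fetching that URL should return one accession per line. Whether the resulting list can then be passed to blastdb_aliastool (for example through a -seqidlist option) depends on the installed blast+ version, so check blastdb_aliastool -help first.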

modified 11 months ago • written 11 months ago by genomax