BLAST Database error: No GIs were found in BLAST database
5.2 years ago

Hello, I'm trying to filter out some sequences from the NCBI nt database. This is how I went about it:

  • 1-Download the prebuilt nt database
  • 2-Search the Entrez Nucleotide database with the query: "taxid3708[ORGN]"
  • 3-Select "Send to File" and choose the format "GI list"
  • 4-Use the list of GIs from the previous step with: blastdb_aliastool -gilist sequence.txt -db nt_v5 -out nt_allergen -dbtype nucl

However, when I use this command line I get this error: BLAST Database error: No GIs were found in BLAST database

Here are some IDs from my GI list: 1376310040 1179788179 1464315148 1551319539 1534512279

Am I retrieving the GI list the right way? Thanks for the help.

next-gen blast • 1.8k views

If you are looking to restrict BLAST results to that taxID, then why not use the new BLAST+ option (available with v2.8.1):

 -taxids <String>
   Restrict search of database to include only the specified taxonomy IDs
   (multiple IDs delimited by ',')
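
As a rough sketch of that suggestion: the database name nt_v5 and taxid 3708 come from the question, while query.fasta and hits.tsv are hypothetical file names chosen only for illustration. It assumes BLAST+ 2.8.1 or later and a v5 database with taxonomy information.

```shell
# Taxonomy ID taken from the Entrez query in the question.
taxid=3708

# Build the command as a string so it can be inspected before running.
# query.fasta and hits.tsv are hypothetical file names.
blast_cmd="blastn -db nt_v5 -query query.fasta -taxids $taxid -outfmt 6 -out hits.tsv"
echo "$blast_cmd"

# Run only if blastn is installed and the query file exists.
if command -v blastn >/dev/null 2>&1 && [ -f query.fasta ]; then
    $blast_cmd
fi
```

With this approach no alias database or GI list is needed. Older BLAST+ releases may match only the exact taxid given, so sequences filed under descendant taxids might have to be listed explicitly.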

According to ftp://ftp.ncbi.nlm.nih.gov/blast/db/README nt is:

Partially non-redundant nucleotide sequences from all traditional divisions of GenBank, EMBL, and DDBJ excluding GSS, STS, PAT, EST, HTG, and WGS.

Your first GI corresponds to MF401153.1, which is identical to MF401152.1 except at position 146, where MF401153.1 has N and MF401152.1 has T. The latter is thus objectively of higher quality and is probably the one present in nt.

BTW, I thought NCBI had phased out GIs already.


Yes, they have phased out GIs for most purposes, so using accession numbers is definitely the preferred way. I will quote the following from NCBI:

So are the GI numbers gone for good?

No! They are still part of the data record, and you will still be able to use them to retrieve the record on the web or using the E-utilities, indefinitely. They will remain in XML and ASN.1 data presentations, and will only be removed from flat files and FASTA.

However, more and more new sequence records will not be assigned a GI number, and so will never be retrievable using GI methods. But records that currently have a GI will always have that GI.
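
A minimal sketch of the accession-based route, under several assumptions: the list file name accessions.txt is hypothetical, the two accessions are the ones mentioned earlier in this thread and serve only as placeholders, and the installed blastdb_aliastool is assumed to offer a -seqidlist option (check `blastdb_aliastool -help` on your version). The database and output names are reused from the question.

```shell
# Hypothetical accession list, one accession.version per line.
acc_list=accessions.txt
printf '%s\n' MF401153.1 MF401152.1 > "$acc_list"

# Count lines that are purely numeric, i.e. look like GIs rather than accessions.
gi_like=$(grep -c -E '^[0-9]+$' "$acc_list" || true)
echo "GI-like entries: $gi_like"

# Same command as in the question, but with -seqidlist instead of -gilist.
alias_cmd="blastdb_aliastool -seqidlist $acc_list -db nt_v5 -dbtype nucl -out nt_allergen"
echo "$alias_cmd"

# Run only if the tool is installed and the nt_v5 database can be opened.
if command -v blastdb_aliastool >/dev/null 2>&1 \
    && blastdbcmd -db nt_v5 -info >/dev/null 2>&1; then
    $alias_cmd
fi
```

The numeric check is only a quick guard against feeding a GI list to an accession-based option; it does not confirm that the accessions are actually present in the database.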
