Question: BLAST Database error: No GIs were found in BLAST database
8 weeks ago, anasofiamoreira94 wrote:

Hello, I'm trying to filter the nt database from NCBI down to a subset of sequences. This is what I did:

  • 1-Download the prebuilt nt database
  • 2-Search the Entrez nucleotide database with the query: "taxid3708[ORGN]"
  • 3-Select "Send to File" and choose format "GI list"
  • 4-Use the list of GIs from the previous step with: blastdb_aliastool -gilist sequence.txt -db nt_v5 -out nt_allergen -dbtype nucl

However, when I use this command line I get this error: BLAST Database error: No GIs were found in BLAST database

Here are some IDs from my GI list: 1376310040 1179788179 1464315148 1551319539 1534512279

Am I retrieving the GI list the right way? Thanks for the help.


If you want to restrict BLAST results to that taxID, why not use the new BLAST+ option (available as of v2.8.1):

 -taxids <String>
   Restrict search of database to include only the specified taxonomy IDs
   (multiple IDs delimited by ',')
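
As a rough sketch of how that option might be used with the database from the question: the file names `query.fasta` and `results.tsv` are placeholders that do not come from the thread, and the tabular output format is just one possible choice. The command only runs if `blastn` and the query file are actually present.

```shell
# Hypothetical file names; nt_v5 and taxid 3708 are taken from the question.
TAXID=3708
DB=nt_v5
QUERY=query.fasta
OUT=results.tsv

if command -v blastn >/dev/null 2>&1 && [ -f "$QUERY" ]; then
    # Restrict the search to sequences assigned to the given taxonomy ID.
    blastn -db "$DB" -query "$QUERY" -taxids "$TAXID" -outfmt 6 -out "$OUT"
else
    echo "blastn or $QUERY not found; would run:"
    echo "blastn -db $DB -query $QUERY -taxids $TAXID -outfmt 6 -out $OUT"
fi
```

This avoids building an alias database at all, since the taxonomy restriction is applied at search time.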
written 8 weeks ago by genomax (65k)

According to ftp://ftp.ncbi.nlm.nih.gov/blast/db/README nt is:

Partially non-redundant nucleotide sequences from all traditional divisions of GenBank, EMBL, and DDBJ excluding GSS, STS, PAT, EST, HTG, and WGS.

Your first GI corresponds to MF401153.1, which is identical to MF401152.1, except that at position 146 MF401153.1 has N and MF401152.1 has T. The latter is thus objectively of higher quality and is probably the one present in nt.

BTW, I thought NCBI had phased out GIs already.
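
One way to check that guess, sketched under the assumption that the BLAST+ tools are installed and `nt_v5` is on the database path, is to ask `blastdbcmd` for each accession directly. The expectation that a missing entry makes `blastdbcmd` exit with a non-zero status is my assumption here.

```shell
# nt_v5 is the database name from the question; the accessions are the two discussed above.
DB=nt_v5

for ACC in MF401153.1 MF401152.1; do
    if command -v blastdbcmd >/dev/null 2>&1; then
        # %a = accession, %T = taxid, %t = sequence title
        blastdbcmd -db "$DB" -entry "$ACC" -outfmt "%a %T %t" \
            || echo "$ACC: not found in $DB"
    else
        echo "blastdbcmd not found; would check $ACC in $DB"
    fi
done
```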

modified 8 weeks ago • written 8 weeks ago by 5heikki (8.4k)

Yes, they have phased out GIs for most purposes, so using accession numbers is definitely the preferred way. To quote NCBI:

So are the GI numbers gone for good?

No! They are still part of the data record, and you will still be able to use them to retrieve the record on the web or using the E-utilities, indefinitely. They will remain in XML and ASN.1 data presentations, and will only be removed from flat files and FASTA.

However, more and more new sequence records will not be assigned a GI number, and so will never be retrievable using GI methods. But records that currently have a GI will always have that GI.
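
Following that advice, the alias database from the question could perhaps be built from accessions instead of GIs. This is only a sketch: it assumes NCBI EDirect (`esearch`, `efetch`) is installed, that `txid3708[ORGN]` is the right Entrez form of the taxonomy query, and that your `blastdb_aliastool` accepts `-seqidlist` in place of `-gilist` (worth confirming with `blastdb_aliastool -help`, since some versions may expect the list to be converted first). The file name `accessions.txt` is hypothetical.

```shell
# Hypothetical file name; database and alias names are taken from the question.
ENTREZ_QUERY='txid3708[ORGN]'
ACC_LIST=accessions.txt

if command -v esearch >/dev/null 2>&1 && command -v blastdb_aliastool >/dev/null 2>&1; then
    # Fetch accession.version identifiers instead of GI numbers.
    esearch -db nucleotide -query "$ENTREZ_QUERY" | efetch -format acc > "$ACC_LIST"
    # Same command as in the question, with an accession list replacing the GI list.
    blastdb_aliastool -seqidlist "$ACC_LIST" -db nt_v5 -out nt_allergen -dbtype nucl
else
    echo "EDirect or BLAST+ not found; skipping accession-based alias creation"
fi
```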

modified 8 weeks ago • written 8 weeks ago by genomax (65k)