Blast+ remote database names
2.5 years ago

I am trying to figure out which database names are supported by the -remote option in BLAST+ 2.12.0. Specifically, I want the names of the 16S and 18S databases.

The options I have tried so far are:

  1. 16S_ribosomal_RNA
  2. rRNA_typestrains
  3. 16S ribosomal RNA (Bacteria and Archaea type strains)
  4. 16SMicrobial
  5. rRNA/ITS

None of them work. I searched for documentation or an existing question on this and found nothing. Even on the FTP site the database is named 16S_ribosomal_RNA, but that name is not recognized.

Every name above produces the same error, for example: "BLAST Database error: '16S_ribosomal_RNA' not found on NCBI servers"
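
Trying candidate names one at a time can be scripted. The following is a minimal sketch, not an official method: it assumes blastn is on the PATH, reuses only the flags that appear in this thread, and decides that a name failed by looking for the error text quoted above. The function names and the `runner` parameter are made up for illustration.

```python
import subprocess

# Error text copied from the message quoted in the question.
NOT_FOUND_MARKER = "not found on NCBI servers"


def build_remote_cmd(db, query_path):
    """Return the blastn argument list for a remote search against `db`."""
    return ["blastn", "-query", query_path, "-db", db, "-outfmt", "6", "-remote"]


def probe_remote_dbs(db_names, query_path, runner=subprocess.run):
    """Run one remote search per name and map each name to a status string."""
    results = {}
    for db in db_names:
        proc = runner(build_remote_cmd(db, query_path),
                      capture_output=True, text=True)
        if NOT_FOUND_MARKER in proc.stderr:
            results[db] = "not found"
        elif proc.returncode != 0:
            results[db] = "other error"
        else:
            results[db] = "ok"
    return results
```

A call such as `probe_remote_dbs(["16S_ribosomal_RNA", "rRNA_typestrains", "nt"], "my_sequences.fa")` submits one real remote search per name, so a single short query sequence is advisable.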

blast remote databases Blastplus • 2.8k views

Interesting observation (i.e. there is no information on which databases are supported for remote BLAST). You should email the BLAST help desk for an authoritative answer and then post it here. Remote search may only support the nt, nr, pdbaa and swissprot databases.


Maybe:

update_blastdb --showall
Connected to NCBI
16S_ribosomal_RNA
18S_fungal_sequences
28S_fungal_sequences
Betacoronavirus
ITS_RefSeq_Fungi
ITS_eukaryote_sequences
LSU_eukaryote_rRNA
LSU_prokaryote_rRNA
SSU_eukaryote_rRNA
cdd_delta
env_nr
env_nt
human_genome
landmark
mito
mouse_genome
nr
nt
pataa
patnt
pdbaa
pdbnt
ref_euk_rep_genomes
ref_prok_rep_genomes
ref_viroids_rep_genomes
ref_viruses_rep_genomes
refseq_protein
refseq_rna
refseq_select_prot
refseq_select_rna
swissprot
taxdb
tsa_nr
tsa_nt

Not every name on this list works with -remote. As noted by the OP (and tested by me), 16S_ribosomal_RNA and several other names near the top of the list (the 16S, 18S, ITS, LSU and SSU databases) simply generate an error saying the database was not found on NCBI servers.

Of the ones I tested, nt, nr, pdbnt, pdbaa, refseq_protein and swissprot work.
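
The listing and the test results above can be combined in a few lines. This is only a sketch: it parses text shaped like the `update_blastdb --showall` output shown earlier, and the set of working names is just what was reported in this thread for BLAST+ 2.12.0, not an official list. The function names are hypothetical.

```python
# Names reported in this thread as working with -remote; not an official list.
REPORTED_WORKING = {"nt", "nr", "pdbnt", "pdbaa", "refseq_protein", "swissprot"}


def parse_showall(text):
    """Extract database names from `update_blastdb --showall` style output."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "Connected to NCBI" status line.
        if not line or line.startswith("Connected to"):
            continue
        names.append(line)
    return names


def split_by_report(names):
    """Split names into (reported working, untested or reported failing)."""
    working = [n for n in names if n in REPORTED_WORKING]
    other = [n for n in names if n not in REPORTED_WORKING]
    return working, other
```

Names in the second group are either untested or reported as failing here, so each would still need to be checked individually.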


Is there any update on this issue?

I'm trying to use the human_genome database with the -remote option, but I still get:

BLAST Database error: 'human_genome' not found on NCBI servers.


Update:

Interestingly, typing GPIPE/9606/current/ref_top_level instead of human_genome works for me:

blastn -query my_sequences.fa -db GPIPE/9606/current/ref_top_level -outfmt 6 -remote

One "problem" here is that I receive this error message:

Error: [blastn] Failed to fetch sequences in batch mode

In practice, though, every sequence in my multi-FASTA query seems to be processed normally and appears in the BLAST output.

edit: Just to clarify, I don't know if this solution is appropriate for end users, and I don't urge anyone to apply it. It's just an interesting observation.
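
One way to check that impression is to compare the query IDs in the FASTA file with the first column of the tabular output. The sketch below assumes the output was written with -outfmt 6 and that BLAST reports each query under the first whitespace-delimited token of its FASTA header; the function names are made up for illustration. A query listed as missing may simply have no hits, so this narrows down what to inspect rather than proving a failure.

```python
def fasta_ids(fasta_text):
    """Return query IDs: the first token after '>' on each header line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.append(tokens[0])
    return ids


def queries_with_hits(outfmt6_text):
    """Return the set of query IDs found in column 1 of tabular output."""
    return {
        line.split("\t")[0]
        for line in outfmt6_text.splitlines()
        if line.strip() and not line.startswith("#")
    }


def queries_without_hits(fasta_text, outfmt6_text):
    """List query IDs from the FASTA that never appear in the BLAST output."""
    seen = queries_with_hits(outfmt6_text)
    return [q for q in fasta_ids(fasta_text) if q not in seen]
```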


I just wanted to pipe in to mention that I'm also getting the Error: [blastn] Failed to fetch sequences in batch mode message. I'm running a similar local blast query using -db GPIPE/9669/102/ref_top_level (I'm BLASTn-ing against the ferret GPIPE accessions in a way advised by one of the NCBI employees when I reached out asking about GPIPE). Just like you, I seem to be getting the desired output hits, but I wanted to "second" your report about getting that message in case it's useful to anyone in the future who finds this thread.


"in a way advised by one of the NCBI employees when I reached out asking about GPIPE"

Interesting. So BLAST support actually told you to use this type of path as the database name for remote BLAST?


How did you figure out the GPIPE/9606/current/ref_top_level path? It may be something NCBI has for internal use that is not intended for outside users.


In web BLAST, I searched a query sequence against the refseq_genomes database while limiting results to Homo sapiens. I found the GPIPE/9606/current/ref_top_level path in the Database field of the results page.

Just to clarify, I don't know if this solution is appropriate for end users, and I don't urge anyone to apply it, but I've had some trouble finding a better/faster alternative.

The following approach seems more appropriate,

blastn -query my_sequences.fa -db refseq_genomes -outfmt 6 -entrez_query="txid9606[Organism]" -remote

though it's significantly slower and I still get this same error:

Error: [blastn] Failed to fetch sequences in batch mode
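
To put a number on "significantly slower", the same command can be assembled and timed from a script. This is a rough sketch that assumes blastn is on the PATH and uses only the flags shown above; the helper names are hypothetical. Because the arguments are passed as a list, the square brackets in the Entrez query need no shell quoting.

```python
import subprocess
import time


def entrez_query_for_taxid(taxid):
    """Build the Entrez restriction used above, e.g. 9606 -> 'txid9606[Organism]'."""
    return f"txid{int(taxid)}[Organism]"


def refseq_genomes_cmd(query_path, taxid):
    """Return the blastn argument list for a taxon-limited remote search."""
    return [
        "blastn", "-query", query_path, "-db", "refseq_genomes",
        "-outfmt", "6", "-entrez_query", entrez_query_for_taxid(taxid),
        "-remote",
    ]


def timed_run(cmd, runner=subprocess.run):
    """Run `cmd` and return (completed process, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    proc = runner(cmd, capture_output=True, text=True)
    return proc, time.perf_counter() - start
```

For example, `timed_run(refseq_genomes_cmd("my_sequences.fa", 9606))` returns the finished process together with its wall-clock time, which includes time spent waiting on NCBI's servers.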