Question: Taxid will not function with other databases (including custom)
emilychase (Aix-Marseille Université, Marseille, France) wrote, 4 weeks ago:

Hello/Bonjour

I cannot get taxids to work with nr or with my custom database (the custom database itself does work): the output never includes results for staxids.

Error: Warning: [blastx] Taxonomy name lookup from taxid requires installation of taxdb database with ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz

The taxdb.btd and taxdb.bti are both in my BLASTDB dir

Code:

  blastx -query SPAdes/contigs.fasta -db ../../BLASTDB/nr -outfmt "6 qseqid sseqid pident qlen length mismatch gapopen evalue bitscore staxids sscinames" -num_threads 24 -out D446_S2_viral_fraction_nr_taxadb_test.blastx -max_target_seqs 20

Any help is appreciated!

Tags: linux, blast+, blast

Did you set the BLASTDB environment variable?

See also the thread "Scientific Names In Blast Output And Databases".

Reply written 4 weeks ago by h.mon

It was set when I originally installed BLAST. Do I need to set it again, or in another way, now that I have added the two taxdb files to the same dir?

Reply written 4 weeks ago by emilychase

What is the result of:

echo $BLASTDB

and:

ls -lh $BLASTDB

Reply written 4 weeks ago by h.mon

echo $BLASTDB

:/home/emily/blast/bin/

ls -lh $BLASTDB

lists all the files in my home dir.

Reply written 4 weeks ago by emilychase

Does the output of echo $BLASTDB really have a : at the beginning?

The result of ls -lh $BLASTDB should be the contents of the folder where your blast databases are located, not your home folder.
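
Both points raised here can be checked with a small shell function. This is only a sketch: the name check_blastdb is hypothetical (it is not part of BLAST+), and it assumes that taxdb.btd and taxdb.bti should sit in one of the colon-separated directories listed in BLASTDB.

```shell
# Hypothetical helper, not part of BLAST+: report whether any directory
# listed in BLASTDB contains both taxdb files.
check_blastdb() {
    # Fail early if the variable is unset or empty.
    if [ -z "${BLASTDB:-}" ]; then
        echo "BLASTDB is not set" >&2
        return 1
    fi
    # A leading ':' means the first entry of the list is empty.
    case $BLASTDB in
        :*) echo "warning: BLASTDB starts with ':'" >&2 ;;
    esac
    # Split on ':' in a subshell so the IFS and globbing changes stay local.
    (
        IFS=:
        set -f
        for dir in $BLASTDB; do
            # Skip empty entries such as the one created by a leading ':'.
            [ -n "$dir" ] || continue
            if [ -f "$dir/taxdb.btd" ] && [ -f "$dir/taxdb.bti" ]; then
                echo "taxdb found in: $dir"
                exit 0
            fi
        done
        echo "taxdb.btd/taxdb.bti not found in any BLASTDB directory" >&2
        exit 1
    )
}
```

After pointing BLASTDB at the folder that holds the databases (for example with export BLASTDB=/path/to/BLASTDB, where the path is a placeholder), running check_blastdb should print the directory in which the taxdb files were found.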

Reply written 4 weeks ago by h.mon

Fixed both of those, thank you. It still won't return taxids with the custom db, though.

Reply written 4 weeks ago by emilychase

By custom database, I suppose you mean a database made with the makeblastdb command. Did you use sequences from GenBank for that, or from another source? And did you add taxids when you made the database?

Reply written 4 weeks ago by gb (modified 4 weeks ago)

Yes, it was made with makeblastdb, and the contents are a fraction of nr, so yes, GenBank. I did not add taxids, but taxdb is in the same dir as the custom db.

Reply written 29 days ago by emilychase
gb wrote, 29 days ago:

You need to add the taxonIDs when you make the database.

I think you first need to download this file: ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.gz

After that you need to decompress the file and extract two columns (accession.version and taxid):

gunzip prot.accession2taxid.gz
sed '1d' prot.accession2taxid | awk '{print $2" "$3}' > accession_taxonid

Then you make the database like this:

sudo makeblastdb -in yourseqs.fa -dbtype prot -taxid_map accession_taxonid -parse_seqids

I have never done it with protein data, but I think it works the same way as for nucleotide (nt) data.

EDIT: I think the process of adding the taxonIDs consumes a lot of memory. If it does not work, BLAST will not give an error, so keep that in mind. If memory is a problem, first reduce accession_taxonid to only the accessions that are present in your sequences, then try again.
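
The memory-saving step could look roughly like the sketch below. The function name filter_taxid_map is hypothetical, and the code assumes that every FASTA header starts with ">" followed directly by the accession.version (for example ">WP_000000000.1 some description"); headers in another style would need a different pattern.

```shell
# Hypothetical helper: keep only the map lines whose accession occurs in
# the FASTA file. Assumes headers look like ">ACCESSION.VERSION description".
filter_taxid_map() {
    fasta=$1   # sequences that will go into the custom database
    map=$2     # two-column file: accession.version, taxid
    awk '
        # First file (FASTA): remember the first word of each header line.
        NR == FNR {
            if (substr($0, 1, 1) == ">") {
                wanted[substr($1, 2)] = 1
            }
            next
        }
        # Second file (map): print only accessions seen in the FASTA.
        $1 in wanted { print $1 " " $2 }
    ' "$fasta" "$map"
}
```

A possible use is filter_taxid_map yourseqs.fa accession_taxonid > accession_taxonid.subset, followed by passing accession_taxonid.subset to -taxid_map instead of the full file. Because a failed mapping is silent, it may be worth checking the result afterwards, for instance with blastdbcmd -db yourdb -entry all -outfmt "%a %T" (yourdb being a placeholder for the database name), which should list a taxid next to each accession.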


This makes sense, thank you!

Reply written 29 days ago by emilychase