Question: Taxid will not function with other databases (including custom)
emilyc0 wrote:

Hello/Bonjour

I cannot get taxonomy IDs to work with nr or with my custom database (searches against the custom database otherwise work); the output does not include any results for staxids.

Error: Warning: [blastx] Taxonomy name lookup from taxid requires installation of taxdb database with ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz

The taxdb.btd and taxdb.bti are both in my BLASTDB dir

Code:

  blastx -query SPAdes/contigs.fasta -db ../../BLASTDB/nr -outfmt "6 qseqid sseqid pident qlen length mismatch gapopen evalue bitscore staxids sscinames" -num_threads 24 -out D446_S2_viral_fraction_nr_taxadb_test.blastx -max_target_seqs 20

Any help is appreciated!

linux blast+ blast • 294 views
modified 7 months ago by gb780 • written 7 months ago by emilyc0

Did you set the BLASTDB environment variable?

Scientific Names In Blast Output And Databases

written 7 months ago by h.mon25k

It was set when I originally installed BLAST. Do I need to set it again, or in another way, now that I have added the two taxdb files to the same dir?

written 7 months ago by emilyc0

What is the result of:

echo $BLASTDB

and:

ls -lh $BLASTDB
written 7 months ago by h.mon25k
echo $BLASTDB

:/home/emily/blast/bin/

ls -lh $BLASTDB

Lists all the files in my home dir

written 7 months ago by emilyc0

Does echo $BLASTDB really have : at the beginning?

The result of ls -lh $BLASTDB should be the contents of the folder where your blast databases are located, not your home folder.
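
Both points can be checked in one go. This is only a minimal sketch: the function name check_blastdb is made up for illustration, and it assumes BLASTDB holds a single directory (a colon-separated list of directories is not handled here).

```shell
# Hypothetical helper: sanity-check a BLASTDB value.
# Assumes a single directory, not a colon-separated list.
check_blastdb() {
    dir="$1"
    case "$dir" in
        "")  echo "BLASTDB is empty"; return 1 ;;
        :*)  echo "BLASTDB starts with ':' -> $dir"; return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir"; return 1
    fi
    # The two taxdb files mentioned in the question must be in that directory.
    for f in taxdb.btd taxdb.bti; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing $dir/$f"; return 1
        fi
    done
    echo "ok: $dir"
}

check_blastdb "${BLASTDB:-}" || echo "fix BLASTDB before running blastx"
```

If it prints anything other than "ok: ...", the variable probably needs to be corrected in your shell startup file and the shell restarted.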

written 7 months ago by h.mon25k

Fixed both of those, thank you. It still won't return taxaID with a custom db though.

written 7 months ago by emilyc0

With a custom database, I suppose you made it with the makeblastdb command. Did you use sequences from GenBank for that, or from another source? And did you add taxids when you made the database?

modified 7 months ago • written 7 months ago by gb780

Yes, it was made with makeblastdb, and the contents are a fraction of nr, so yes, GenBank. I did not add taxids, but the taxdb files are in the same dir as the custom db.

written 7 months ago by emilyc0
gb780 wrote:

You need to add the taxonIDs when you make the database.

I think you first need to download this file: ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.gz

After that you need to decompress it (gunzip prot.accession2taxid.gz) and extract two columns, the accession.version and the taxid:

sed '1d' prot.accession2taxid | awk '{print $2" "$3}' > accession_taxonid

Then you make the database like this:

sudo makeblastdb -in yourseqs.fa -dbtype prot -taxid_map accession_taxonid -parse_seqids

I have never done it with protein data, but I think it works the same as for nt.
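
To see whether the taxids actually ended up in the database, you can list what is stored. A rough sketch: blastdbcmd can print the accession and taxid of each entry with the %a and %T format specifiers. The helper name count_missing_taxids and the database name mydb are placeholders, and I am assuming that entries without a mapping are reported with taxid 0 (or N/A).

```shell
# Hypothetical helper: reads "accession taxid" lines on stdin and
# prints how many entries have no usable taxid.
# Assumption: unmapped entries show up as taxid 0 (or N/A, or nothing).
count_missing_taxids() {
    awk '$2 == "0" || $2 == "N/A" || $2 == "" { n++ } END { print n + 0 }'
}

# Usage, with "mydb" standing in for your database name:
#   blastdbcmd -db mydb -entry all -outfmt "%a %T" | count_missing_taxids
```

A result of 0 would suggest every sequence got a taxid; anything else means some accessions were not found in the map.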

EDIT: I think the process of adding the taxonIDs consumes a lot of memory. If it fails, BLAST will not report an error, so keep that in mind. If memory is a problem, first extract from accession_taxonid only the accessions that are in your database, then try again.
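
The extraction step could look roughly like this. It is a sketch, not something I have run on the full file: it assumes your FASTA headers start with a bare accession.version (for example ">ABC12345.1 some description") and uses the file names from above. The function name subset_taxid_map is made up.

```shell
# Hypothetical helper: keep only the map lines whose accession occurs
# in the FASTA file. Assumes headers look like ">ACCESSION.VERSION ...".
subset_taxid_map() {
    fasta="$1"; map="$2"
    awk '
        # First file (FASTA): remember the ID of every header line.
        NR == FNR { if (/^>/) { id = substr($1, 2); want[id] = 1 }; next }
        # Second file (map): print lines whose accession was seen.
        ($1 in want)
    ' "$fasta" "$map"
}

# Usage:
#   subset_taxid_map yourseqs.fa accession_taxonid > accession_taxonid.subset
```

Only the IDs from the FASTA are held in memory while the large map is streamed line by line, and the smaller file can then be passed to -taxid_map.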

modified 7 months ago • written 7 months ago by gb780

This makes sense, thank you!

written 7 months ago by emilyc0