Add taxID to makeblastdb
4.3 years ago

I'm sorry if this is a repeated question, but I still have doubts.

I created a blastdb like this:

makeblastdb -in input.fasta -dbtype nucl -title test_DB -parse_seqids -out test_DB

But I can't work out how to add the taxid so that I get the same result as if I had used the full nt database.

Can someone clarify this for me? Thanks.

blastdb windows • 3.3k views

Can you show what your input fasta headers look like? For example, post the output of:

grep "^>" input.fasta | head -5

4.2 years ago
gb ★ 2.2k

Your command should look like:

makeblastdb -in input.fasta -dbtype nucl -title test_DB -parse_seqids -taxid_map taxidmapfile -out test_DB

The taxid map file is a plain-text file with two columns: the sequence ID and the NCBI taxonomy ID, one pair per line. You can download the accession-to-taxid mapping here:

ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/nucl_gb.accession2taxid.gz

After unpacking it, you can build the taxid map file with something like the following, which drops the header line and keeps the accession.version and taxid columns:

sed '1d' nucl_gb.accession2taxid | awk '{print $2" "$3}' > taxidmapfile
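
The full nucl_gb.accession2taxid file covers far more accessions than a small custom fasta contains, so it may help to keep only the rows you actually need. The following is a minimal sketch, assuming the first word of each fasta header is exactly an accession.version; build_taxid_map is a hypothetical helper name, not part of BLAST+:

```shell
# Sketch: keep only the accession2taxid rows whose accession.version
# appears as a sequence ID in the fasta file.
# build_taxid_map is a hypothetical helper, not a BLAST+ command.
# Usage: build_taxid_map input.fasta nucl_gb.accession2taxid > taxidmapfile
build_taxid_map() {
    awk '
        # First file (fasta): remember the first word of every header line.
        FILENAME == ARGV[1] {
            if (substr($0, 1, 1) == ">") wanted[substr($1, 2)] = 1
            next
        }
        # Second file (accession2taxid): skip its header line and print
        # "accession.version taxid" for the wanted IDs only.
        FNR > 1 && ($2 in wanted) { print $2, $3 }
    ' "$1" "$2"
}
```

If the resulting file is empty, none of the fasta IDs matched an accession.version in the mapping, which is worth checking before running makeblastdb.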

I did as you said, but then I received this error message: [makeblastdb] No sequences matched any of the taxids provided...


Ah, that can happen, depending on your input fasta. The fasta headers need to be in a specific format (like the headers of sequences downloaded from GenBank). Can you show what the first few headers in your fasta look like?

EDIT:

You may also find your answer in this thread: How to make a custom blast db with taxon IDs from a taxid_map file
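
One way to narrow down this kind of error is to list the fasta IDs that have no entry in the taxid map. The sketch below assumes the IDs must match as plain strings and that the map has the sequence ID in its first column; ids_missing_from_map is a hypothetical helper name, not part of BLAST+:

```shell
# Sketch: print every fasta sequence ID that is absent from the taxid map.
# ids_missing_from_map is a hypothetical helper, not a BLAST+ command.
# Usage: ids_missing_from_map taxidmapfile input.fasta
ids_missing_from_map() {
    awk '
        # First file (taxid map): column 1 is the sequence ID.
        FILENAME == ARGV[1] { known[$1] = 1; next }
        # Second file (fasta): report header IDs that the map does not list.
        substr($0, 1, 1) == ">" {
            id = substr($1, 2)
            if (!(id in known)) print id
        }
    ' "$1" "$2"
}
```

With this check, a header such as >NC_028405.1_COX1 would be reported as missing when the map only lists NC_028405.1, because the two strings differ.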


My fasta headers look like this: >NC_028405.1_COX1. I managed to solve this problem, thanks for the help! But now I have another question: I finally managed to build the database I wanted, but I got a different, worse result than when I used the full nt database. When I use a 'pre-made' database, shouldn't the results be better and more precise? Thanks again for the help.


"When I use a 'pre-made' database, shouldn't the results be better and more precise?"

In what way?

BLAST results depend heavily on the search space (database size). This differs vastly between nt and any custom database you build, so the results will differ, at least if you are looking at E-values and similar statistics.
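
To illustrate the search-space effect, here is a rough sketch based on the standard bit-score relation E = m * n * 2^(-S'), where m is the query length, n the database length and S' the bit score. It ignores the effective-length corrections BLAST applies, so the numbers are only indicative; evalue is a hypothetical helper name and the lengths in the usage lines are made up:

```shell
# Sketch: approximate E-value from a bit score and the search space.
# evalue BIT_SCORE QUERY_LENGTH DATABASE_LENGTH
# Hypothetical helper; BLAST itself uses effective lengths, so real
# E-values will differ somewhat from this approximation.
evalue() {
    awk -v bits="$1" -v m="$2" -v n="$3" \
        'BEGIN { printf "%.3g\n", m * n * 2 ^ (-bits) }'
}

# Same hit (bit score 40, query of 100 bases) against two database sizes:
evalue 40 100 1000000       # small custom database
evalue 40 100 1000000000    # database 1000 times larger
```

The same alignment gets an E-value 1000 times larger in the second case, which is why hits from a small custom database and from nt are not directly comparable by E-value alone.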
