I seem to be stuck on something that should be very simple, and I hope I haven't overlooked something obvious.
Question: How can I make a valid BLAST database with taxids from an NCBI query export?
What I have tried so far:
For a metagenomics project I need a custom BLAST database, which I want to generate from the results of the following NCBI Nucleotide query:
Viruses[Organism] AND srcdb_refseq[PROP] NOT cellular organisms [ORGN]
The query returns 3986 entries, which I exported and saved (via 'Send to') in FASTA and ASN.1 format. Both files appear to contain the right number of entries. As this is a metagenomics project, I would like to have the taxon IDs in the BLAST database.
I successfully made a valid BLAST database from the FASTA file using makeblastdb, but the FASTA headers don't include taxids. I therefore tried to make a BLAST database from the ASN.1 export using the following command (it is not clear from the documentation which input formats makeblastdb accepts):
$ makeblastdb -in AllViralDNARefSeq.asn1 -dbtype nucl -out ViralASN1 -title "All Viral RefSeq DNA from NCBI ASN1"

Building a new DB, current time: 12/20/2011 10:37:28
New DB name: ViralASN1
New DB title: All Viral RefSeq DNA from NCBI ASN1
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
Adding sequences from FASTA; **added 10 sequences** in 0.00906897 seconds.
As you can see, this does not work: only 10 sequences are added.
Any help getting the taxids into the database is appreciated. It doesn't have to be elegant; I just need the database from that query. I am using BLAST+ 2.2.25.
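One direction that might work, sketched here under assumptions rather than tested against this data: makeblastdb has a `-taxid_map` option (used together with `-parse_seqids`) that takes a text file with one sequence ID and one taxid per line. The Python sketch below builds such a file's lines from the FASTA headers, assuming the taxids have already been obtained separately and that the first whitespace-delimited token of each header is the ID makeblastdb expects. The function names `fasta_ids` and `build_taxid_map` and the ID-to-taxid dictionary are hypothetical, not part of BLAST+.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def fasta_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield the first whitespace-delimited token of each FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:  # skip a bare ">" with no ID
                yield tokens[0]


def build_taxid_map(
    lines: Iterable[str], taxids: Dict[str, int]
) -> Tuple[List[str], List[str]]:
    """Return (map_lines, missing_ids) for the given FASTA lines.

    map_lines holds "<sequence ID> <taxid>" strings, one per sequence found
    in `taxids`; missing_ids lists sequence IDs with no known taxid.
    `taxids` is an assumed, separately obtained ID-to-taxid lookup.
    """
    map_lines: List[str] = []
    missing: List[str] = []
    for seq_id in fasta_ids(lines):
        taxid = taxids.get(seq_id)
        if taxid is None:
            missing.append(seq_id)
        else:
            map_lines.append(f"{seq_id} {taxid}")
    return map_lines, missing
```

The returned lines could be written to a file and passed to makeblastdb via `-taxid_map` along with `-parse_seqids`. Whether BLAST+ 2.2.25 accepts the IDs in exactly this form is something I have not verified, so the list of missing IDs is worth checking before building the database.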