subset ncbi nt db?
4 months ago
Eugene A ▴ 180

Hi everyone,

I was wondering, is it possible to download the nt db only for a given taxid (species)? Or at least create a sub-DB after downloading and then delete the rest of nt? I'd like to blast against human sequences only, so I do not want to store the whole nt_euk on my server. But I failed to find any option to specify a taxid in the update_blastdb.pl script, and I also failed to find a collection of Homo sapiens sequences for use with blast+. Are there any workarounds here?

Best, Eugene

ncbi blast nt • 436 views
4 months ago
GenoMax 141k

Are there any workarounds here?

No. If you want to subset nt to contain only human sequences, you will need to download the entire database and then extract the human sequences like this:

blastdbcmd -db nt -taxids 9606 -outfmt %f  -out human.fa

You can then make a new blast index using human.fa.
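
A minimal sketch of that indexing step, assuming BLAST+ is on your PATH and human.fa is in the current directory. The database name human_nt and the title are placeholders I picked for illustration, not anything NCBI provides:

```shell
#!/bin/sh
# Sketch: build a nucleotide BLAST database from the extracted FASTA.
FASTA="human.fa"     # output of the blastdbcmd extraction step
DB_NAME="human_nt"   # hypothetical name for the new database

build_human_db() {
    # -dbtype nucl because nt is a nucleotide collection
    makeblastdb -in "$FASTA" -dbtype nucl -out "$DB_NAME" \
        -title "nt subset, taxid 9606"
}

# Only run when the tool and the input file are actually present.
if command -v makeblastdb >/dev/null 2>&1 && [ -s "$FASTA" ]; then
    build_human_db
else
    echo "skipped: makeblastdb or $FASTA not found"
fi
```

Once the new database works for your searches (blastn -db human_nt ...), the full nt files can be removed to free the space.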

Downloading nt_euk indexes may be a smaller download compared to full nt. I don't know if that contains all human sequences that are in full nt.

If you do end up downloading the pre-made nt indexes, you could instead limit your blast searches by human taxID (without subsetting), unless you can't keep the full nt indexes around and must save space.
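
Restricting a search by taxID could look roughly like this. It assumes a recent BLAST+ release and the pre-made (version 5) nt database, which as far as I know is what the -taxids option needs; query.fa and hits.tsv are placeholder file names:

```shell
#!/bin/sh
# Sketch: search the full nt database but only report human (taxid 9606) hits.
QUERY="query.fa"   # hypothetical query file
OUT="hits.tsv"     # hypothetical output file

search_human_only() {
    # -taxids restricts the search to the listed taxonomy IDs
    # -outfmt 6 writes tab-separated hits
    blastn -db nt -query "$QUERY" -taxids 9606 -outfmt 6 -out "$OUT"
}

# Only run when the tool and the query file are actually present.
if command -v blastn >/dev/null 2>&1 && [ -s "$QUERY" ]; then
    search_human_only
else
    echo "skipped: blastn or $QUERY not found"
fi
```

This keeps the full nt on disk, so it only helps if storage is not the limiting factor.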
