Remove bacteria data from nt database
4.5 years ago

Hi all, I want to remove the bacterial sequences from the full nt database. Can someone tell me the best way to do this? Thanks

nt ncbi • 1.1k views

As far as I can tell, nt sequences are annotated at the genus level. So the only way you may be able to do this is to get those names and exclude the ones that are bacteria.


It may be simpler to post-filter your results for bacteria instead?
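A minimal sketch of such a post-filter, assuming the search was run with `-outfmt "6 std staxids"` so that the subject taxids land in the 13th column. The function name `drop_hits_by_taxid` and the idea of passing in a pre-built set of bacterial taxids are illustrative assumptions, not part of BLAST itself.

```python
def drop_hits_by_taxid(lines, excluded_taxids, taxid_column=12):
    """Yield tabular BLAST hit lines whose subject taxids are not excluded.

    Assumes -outfmt "6 std staxids": 12 standard columns followed by
    staxids (index 12), with multiple taxids separated by ';'.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Non-numeric values such as "N/A" are ignored, so hits without
        # taxonomy information are kept rather than dropped.
        taxids = {int(t) for t in fields[taxid_column].split(";") if t.isdigit()}
        if taxids & excluded_taxids:
            continue  # at least one subject taxid is on the exclusion list
        yield line
```

The set passed as `excluded_taxids` has to contain the species-level (or lower) bacterial taxids that actually appear in the hits, not just the top-level Bacteria ID.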

As @lieven points out below

 -negative_taxids <String>
   Restrict search of database to everything except the specified taxonomy IDs
   (multiple IDs delimited by ',')

should work, assuming nt sequences are properly annotated with the bacterial taxID.

Edit: No sequences in nt appear to be annotated with taxID 2 so that idea is not going to work.
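One possible way around this, sketched here rather than tested against nt: since the sequences carry lower-level taxids, the Bacteria ID could be expanded to all of its descendants using the `nodes.dmp` file from the NCBI taxonomy dump, and that list handed to BLAST instead. The function names below are hypothetical; the only assumption about the input is the `nodes.dmp` layout (taxid and parent taxid as the first two `|`-separated fields).

```python
from collections import defaultdict, deque


def parse_nodes(lines):
    """Map parent taxid -> list of child taxids from nodes.dmp-style lines."""
    children = defaultdict(list)
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2 or not fields[0]:
            continue
        child, parent = int(fields[0]), int(fields[1])
        if child != parent:  # the root node (taxid 1) lists itself as parent
            children[parent].append(child)
    return children


def descendant_taxids(children, root):
    """Return the set containing root and every taxid below it (breadth-first)."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

Writing `sorted(descendant_taxids(children, 2))` to a file, one ID per line, gives a list that could be passed with `-negative_taxidlist`, assuming a BLAST+ release and database version that support taxonomic filtering.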


Alternatively, if you are using the newest BLAST version, use the taxonomic filtering options and set them to report only eukaryotic hits. There is no need to modify your BLAST database in this case.

EDIT/update: although this seems to work on the NCBI web BLAST, there are indications that it does not work with the local CLI version.


I'm using BLAST locally.


This would work if I add the IDs of the species to remove. But then again, those IDs can change, so the results would differ.


Hi, I think limiting a database search by taxa should now be possible even in offline BLAST.

See this NCBI webinar

And/or this post: https://ncbiinsights.ncbi.nlm.nih.gov/2019/01/04/blast-2-8-1-with-new-databases-and-better-performance/.

But if you are after the sequences themselves, I'm not aware of any option to extract them by taxon directly from the nt database. One possible way might be to list all accessions in nt (blastdbcmd), run them through Entrez or get yourself an accession2taxid table, select the ones you want, and then extract them using blastdbcmd.
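The selection step in the middle could look roughly like this. It assumes the accession listing has one `accession taxid` pair per line, which is what something like `blastdbcmd -db nt -entry all -outfmt "%a %T"` should print; `select_accessions` and the choice to keep entries without a numeric taxid are my own assumptions.

```python
def select_accessions(acc_taxid_lines, excluded_taxids):
    """Return accessions whose taxid is not in excluded_taxids.

    Each input line is expected to hold 'accession taxid' separated by
    whitespace; malformed lines are skipped.
    """
    kept = []
    for line in acc_taxid_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # not an 'accession taxid' pair
        accession, taxid = parts
        # Entries without a numeric taxid are kept (an assumption; flip
        # this if unannotated sequences should be dropped instead).
        if not taxid.isdigit() or int(taxid) not in excluded_taxids:
            kept.append(accession)
    return kept
```

The kept accessions, written one per line to a file, can then go back into blastdbcmd via `-entry_batch` to pull the sequences out as FASTA.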

GL
