Filtering sequences from the NCBI nt database and downloading sequences from NCBI WGS
3.4 years ago

Dear all,

I am currently building a database index to be used for taxonomy assignment. I would like to filter sequences from the NCBI non-redundant nucleotide database (nt database) and download FASTA sequences from the NCBI WGS database based on taxonomy ID, then include both in my index. Does anyone know which software can do this? I came across the deprecated software draftGenomes (https://github.com/khyox/draftGenomes), but it no longer works.

Thank you, and I look forward to your suggestions.

Best, Lim

genome • 773 views
3.4 years ago
GenoMax 141k

For the WGS part, see the answer here: A: How to get BLAST wgs database

To limit your searches against the nt database to specific taxIDs, look into the -taxids or -taxidlist options of BLAST+.
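
As a rough illustration of the two options above, the sketch below assembles a blastn command line restricted to chosen taxIDs. The helper name `build_blastn_cmd`, the query file name, and the example taxID are hypothetical. It assumes `-taxids` takes a comma-separated list and `-taxidlist` takes a file with one taxID per line, and that the local nt database is in the version 5 format that carries taxonomy information.

```python
from typing import List, Optional, Sequence


def build_blastn_cmd(query: str, db: str = "nt",
                     taxids: Optional[Sequence[int]] = None,
                     taxidlist: Optional[str] = None) -> List[str]:
    """Assemble a blastn argument list limited to the given taxIDs.

    Hypothetical helper: pass exactly one of `taxids` or `taxidlist`.
    """
    if (taxids is None) == (taxidlist is None):
        raise ValueError("give exactly one of taxids or taxidlist")
    if taxids is not None and len(taxids) == 0:
        raise ValueError("taxids must not be empty")

    # Tabular output (-outfmt 6) is an arbitrary choice for this sketch.
    cmd = ["blastn", "-db", db, "-query", query, "-outfmt", "6"]
    if taxids is not None:
        # -taxids expects a comma-separated list on the command line.
        cmd += ["-taxids", ",".join(str(t) for t in taxids)]
    else:
        # -taxidlist expects a path to a file with one taxID per line.
        cmd += ["-taxidlist", taxidlist]
    return cmd


if __name__ == "__main__":
    # Only prints the command; run it yourself once BLAST+ and nt are installed.
    print(" ".join(build_blastn_cmd("query.fa", taxids=[9606])))
```

The function only builds the argument list, so it can be inspected before being passed to something like `subprocess.run`.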


Thank you for your suggestion! Sorry, I am relatively new to this, so please correct me if I have misunderstood.

For the WGS part, most of the links in that post no longer work, but I managed to find the new README file for the WGS database (https://ftp.ncbi.nlm.nih.gov/blast/WGS_TOOLS/README_BLASTWGS.txt). If I understand the instructions correctly, the tool only creates an alias file based on the taxonomy IDs provided; it does not download the sequences from the database. I am hoping to download the sequences from the WGS database, integrate them with the nt database, and build an index for taxonomy assignment on my local machine (tool: Centrifuge, not BLAST). I will still give it a try to see if it works.
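
For the nt filtering step, one possible route outside BLAST is to filter a FASTA file against an accession-to-taxID table. The sketch below is only a minimal illustration under stated assumptions: the table is tab-separated with the columns accession, accession.version, taxid, gi (the layout of NCBI's accession2taxid files), each FASTA header starts with the accession.version, and matching is on the exact taxID only, without descendant taxa. The function names are hypothetical.

```python
from typing import Iterable, Iterator, Set


def accessions_for_taxids(table_lines: Iterable[str],
                          wanted: Set[int]) -> Set[str]:
    """Collect accession.version values whose taxID is in `wanted`.

    Assumes tab-separated rows: accession, accession.version, taxid, gi.
    Only matching rows are stored, to keep memory use low.
    """
    keep: Set[str] = set()
    for line in table_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] == "accession":
            continue  # skip the header row and malformed rows
        if int(fields[2]) in wanted:
            keep.add(fields[1])
    return keep


def filter_fasta(fasta_lines: Iterable[str],
                 accessions: Set[str]) -> Iterator[str]:
    """Yield the lines of FASTA records whose accession is in `accessions`."""
    keep_record = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: the first whitespace-delimited token after '>'
            # is the accession.version.
            tokens = line[1:].split()
            keep_record = bool(tokens) and tokens[0] in accessions
        if keep_record:
            yield line


if __name__ == "__main__":
    table = [
        "accession\taccession.version\ttaxid\tgi\n",
        "A00001\tA00001.1\t9606\t1\n",
        "B00002\tB00002.1\t10090\t2\n",
    ]
    fasta = [">A00001.1 example one\n", "ACGT\n",
             ">B00002.1 example two\n", "GGGG\n"]
    wanted_accessions = accessions_for_taxids(table, {9606})
    for out_line in filter_fasta(fasta, wanted_accessions):
        print(out_line, end="")
```

Both functions accept any iterable of lines, so open file handles can be passed directly and the FASTA is streamed rather than loaded into memory.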
