Tool: Download all RefSeq/GenBank bacterial genomes from NCBI
7.3 years ago
johnsrc06 ▴ 10

I've been trying to find an EASY way to download all genomes (FASTA, GenBank, GFF, etc.) from NCBI's RefSeq or GenBank. I decided to write my own program in Python to make the process easier and more flexible for researchers. Let me know if this helps you or if you have any suggestions:

https://github.com/ryjohnson09/bacteria_genome_pull

genome sequencing • 3.6k views

WouterDeCoster: This could stay classified as a tool, as the OP had it. It is a ready-to-use script that anyone can run as is.

Oh yeah, you are right. I guess I'm getting used to adjusting post classifications, but I should have read more carefully here.

Perhaps this tool does what you are looking for: https://github.com/kblin/ncbi-genome-download

johnsrc06: I have not tried your script, but you should add some notes about how long it takes to run (you appear to be parsing the actual directories, is that correct?).

Thanks for the suggestions (I'm new to Biostars). I've changed it to a tool and will run a few tests so I can add some details about run time.

Just a thought: you may want to consider parsing the genome summary file that lists the contents of the RefSeq genomes (it is on the FTP site). It may be faster than parsing the directories.
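
To illustrate the summary-file idea, here is a minimal sketch. It assumes the file meant is NCBI's tab-separated assembly_summary.txt (for example under genomes/refseq/bacteria/ on the FTP site), whose last comment line holds the column names, including assembly_level and ftp_path. The function names and the sample rows are made up for illustration (the real file has many more columns), and this is not code from the tool itself.

```python
def parse_assembly_summary(lines):
    """Yield one dict per assembly from the lines of a summary file."""
    header = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # The last comment line before the data holds the column names.
            header = line.lstrip("#").strip().split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before the data rows")
        yield dict(zip(header, line.split("\t")))


def complete_genome_paths(lines):
    """Return the ftp_path of every row marked as a complete genome."""
    return [
        row["ftp_path"]
        for row in parse_assembly_summary(lines)
        if row.get("assembly_level") == "Complete Genome"
        and row.get("ftp_path", "na") != "na"
    ]


# Hypothetical sample with only four columns and placeholder URLs.
SAMPLE = [
    "#   See the README for a description of the columns\n",
    "# assembly_accession\torganism_name\tassembly_level\tftp_path\n",
    "GCF_000000001.1\tExample bacterium A\tComplete Genome\t"
    "https://example.org/genomes/GCF_000000001.1_ASM1v1\n",
    "GCF_000000002.1\tExample bacterium B\tContig\t"
    "https://example.org/genomes/GCF_000000002.1_ASM2v1\n",
]

paths = complete_genome_paths(SAMPLE)
```

With the real file you would pass an open file handle instead of SAMPLE. One download of the summary replaces one directory listing per organism, which is where the speed-up would come from.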

I had a look at your code and it looks decent. I tried your example usage from GitHub and it did the job here. If you rewrote print "something" as print("something"), your code would also be compatible with Python 3.
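
A small sketch of the print change, using a made-up helper (report is not a function from the tool). The single-argument call form shown here runs on both Python 2 and Python 3:

```python
def report(name, count):
    """Build a status message, print it, and return it."""
    message = "Downloaded {} files for {}".format(count, name)
    # Python 2 only:  print message
    # Works on both:  print(message)
    print(message)
    return message


status = report("E_coli", 3)
```

Formatting the whole message first keeps the call to a single argument, which matters because print("a", "b") prints a tuple on Python 2 unless print_function is imported from __future__.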

Good suggestion, I'll put it on my to-do list. Thanks!

Can you easily amend this tool to allow downloads of whole-genome protein sequence files? I noticed that it currently only supports GenBank/FASTA/GFF files.
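
One possible way to add protein files, sketched under assumptions: in NCBI's genomes FTP layout, each assembly directory holds files named after the directory plus a type suffix, with _protein.faa.gz for the protein FASTA. The dictionary keys, the function name, and the placeholder URL below are hypothetical, and this is not how the tool is actually written.

```python
# Suffixes follow NCBI's per-assembly file naming; the keys are made up.
SUFFIXES = {
    "fasta": "_genomic.fna.gz",
    "genbank": "_genomic.gbff.gz",
    "gff": "_genomic.gff.gz",
    "protein": "_protein.faa.gz",
}


def file_url(assembly_dir_url, file_type):
    """Build the URL of one file inside an assembly directory."""
    base = assembly_dir_url.rstrip("/")
    # The file name starts with the directory's own name.
    dirname = base.rsplit("/", 1)[-1]
    return "{}/{}{}".format(base, dirname, SUFFIXES[file_type])


url = file_url("https://example.org/genomes/GCF_000000001.1_ASM1v1/", "protein")
```

Supporting another file type then only means adding one entry to the dictionary. Note that not every assembly directory is guaranteed to contain a protein file, so a real download step should handle a missing file gracefully.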
