How to download all Pseudomonas aeruginosa Genomes from NCBI Genomes database?
3.5 years ago
Optimist ▴ 180

Hello All,

I want to download all Pseudomonas aeruginosa genomes from the NCBI Genomes database. As of now (23/10/2020), there are 5556 genomes for the species Pseudomonas aeruginosa.

Kindly let me know a way to download all of them, preferably with the strain name.

Thanking You

NCBI Genome-Assembly • 1.7k views
3.5 years ago
vkkodali_ncbi ★ 3.7k

You can download these data directly from NCBI using the Datasets tool. Check out: NCBI Datasets for more details.


Note: The web interface for NCBI Datasets only provides access to eukaryotic genomes. Use the command-line tool for all genomes, including bacteria.


NCBI Datasets now provides access to data for viruses and prokaryotes, including Pseudomonas aeruginosa.

While our Genomes page is limited to a maximum of 1,000 genomes for a single download, you can use the datasets command-line tool to download 15,365 Pseudomonas aeruginosa genomes.

Since this is such a large dataset, at about 30 GB compressed for genome sequence and metadata, I recommend you try this simple three-step approach:

  1. Download a dehydrated data package for all Pseudomonas aeruginosa genomes. The dehydrated package contains only the metadata and the locations of the sequence files; the genome sequences themselves are fetched in step 3.

    datasets download genome taxon "pseudomonas aeruginosa" --exclude-genomic-cds --exclude-protein --exclude-gff3 --filename aeruginosa.zip --dehydrated

  2. Extract the downloaded package.

    unzip aeruginosa.zip -d aeruginosa

  3. Rehydrate the extracted package to get the genomic sequences.

    datasets rehydrate --directory aeruginosa/
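
If you want to run the three steps in one go, they could be wrapped in a small Bash function along these lines. This is only a sketch: the `run` helper, the `fetch_genomes` function name and the `DRY_RUN` variable are made up for this example and are not part of the datasets tool. The download flags are copied verbatim from step 1 and may differ in other versions of datasets.

```shell
#!/usr/bin/env bash
# Sketch: chain the download / unzip / rehydrate steps and stop at the first failure.
set -euo pipefail

# run (hypothetical helper): print the command, then execute it unless DRY_RUN=1.
# The printed line is for display only; arguments containing spaces are shown unquoted.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# fetch_genomes (hypothetical name): $1 = taxon name, $2 = base name for the zip and output directory.
fetch_genomes() {
    local taxon="$1" name="$2"
    # Step 1: dehydrated package (metadata plus file locations, no sequences yet).
    run datasets download genome taxon "$taxon" \
        --exclude-genomic-cds --exclude-protein --exclude-gff3 \
        --filename "${name}.zip" --dehydrated
    # Step 2: extract the package.
    run unzip "${name}.zip" -d "$name"
    # Step 3: fetch the actual genome sequences.
    run datasets rehydrate --directory "${name}/"
}
```

After sourcing the script, `DRY_RUN=1 fetch_genomes "pseudomonas aeruginosa" aeruginosa` prints the three commands without running them, and dropping `DRY_RUN=1` runs them for real.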
