Question

NCBI Datasets CLI Question

0

4 months ago

Bjorn • 0

Hello,

I am new to the datasets command-line tool. I want to download a subset of all Pseudomonas genomes from NCBI Datasets. The following command successfully downloads all of the Pseudomonas genomes:

 datasets download genome taxon 286 --dehydrated --include genome,gff3 --filename Pseudomonas_Whole_Genera_NCBI/genomes.zip

However, this takes a long time and uses too much storage on my computer. Is there any way to download only a random sample of 200 of the Pseudomonas genomes?

I could list the accession IDs manually, but is there a quick, manageable way to get them? Are there other options? Any help would be appreciated.

Best,
B

ncbi-datasets • 1.3k views

score 3 · Accepted Answer · 2025-07-07

4 months ago

GenoMax 154k

I don't see an option to download a random sample of accessions. You may want to open an issue and suggest that as a feature request to the dev team.

Your best bet may be to run datasets summary genome taxon 286, select the number of genome accessions you want (via dataformat), and then go back to your original command and specify those accessions on the command line (datasets download genome accession ACC1 ACC2 ... ACC200).

Edit: The following will get you 200 RefSeq genome IDs that you can capture.

 ./datasets summary genome taxon 286 --limit 200 --reference --assembly-source RefSeq --report ids_only --as-json-lines | ./dataformat tsv genome --fields accession

Or optionally use https://www.ncbi.nlm.nih.gov/datasets/genome/?taxon=287 (use the dropdown to display 100 rows per page, so 2 pages should give you 200 accessions) to either copy the accession numbers from the table or download the genomes directly from that GUI.
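
Neither route above draws genomes at random, since --limit 200 simply caps the listing at 200 records. As a rough sketch of one way to get a random sample, you could list every RefSeq accession for the taxon and let shuf pick 200 of them. The function names sample_accessions and fetch_random_sample and the file name sampled_accessions.txt are hypothetical, not part of the datasets CLI. The sketch assumes GNU shuf is installed and that datasets and dataformat are on your PATH. It drops --limit and --reference so the sample comes from all RefSeq assemblies, and it hands the list to the download step with --inputfile.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read accessions on stdin, print n of them chosen at random.
sample_accessions() {
    local n="$1"
    # Keep only lines that look like assembly accessions (this drops the
    # TSV header line), then pick n without replacement using GNU shuf.
    grep -E '^GC[AF]_' | shuf -n "$n"
}

# Hypothetical wrapper: sample n RefSeq genomes from a taxon and download them.
# Arguments: taxon ID, sample size, output zip file name.
fetch_random_sample() {
    local taxon="$1" n="$2" out="$3"
    datasets summary genome taxon "$taxon" --assembly-source RefSeq \
            --report ids_only --as-json-lines \
        | dataformat tsv genome --fields accession \
        | sample_accessions "$n" > sampled_accessions.txt
    datasets download genome accession --inputfile sampled_accessions.txt \
        --dehydrated --include genome,gff3 --filename "$out"
}

# Example call (needs network access and the NCBI tools installed):
# fetch_random_sample 286 200 genomes_sample.zip
```

Keeping the sampled list in a file means you can rerun the download later with exactly the same genomes. If the taxon has fewer assemblies than requested, shuf just returns all of them.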

0

Maybe make this an answer rather than a comment?

ADD REPLY • link 4 months ago by Mensur Dlakic ★ 30k

0

Wanted to make it a complete answer with an additional command. Moved now.

ADD REPLY • link 4 months ago by GenoMax 154k

0

Thanks! Works great.

ADD REPLY • link 4 months ago by Bjorn • 0