NCBI Datasets CLI Question
10 weeks ago
Bjorn • 0

Hello,

I am new to the datasets command-line tool. I want to download a subset of the Pseudomonas genomes from NCBI Datasets. If I run the command:

 datasets download genome taxon 286 --dehydrated --include genome,gff3 --filename Pseudomonas_Whole_Genera_NCBI/genomes.zip

it successfully downloads all of the Pseudomonas genomes.

However, this takes a long time and uses too much storage on my computer. Is there any way to download only a random sample of 200 of the Pseudomonas genomes?

I could list all the accession IDs manually, but is there a quick, manageable way to get them? Are there other options? Any help would be appreciated.

Best,
B

ncbi-datasets • 905 views
10 weeks ago
GenoMax 153k

I don't see an option to download a random sample of accessions. You may want to open an issue and suggest it as a feature request to the dev team.

Your best bet may be to run datasets summary genome taxon 286, select the genome accessions you want (via dataformat), and then go back to your original command and specify those accessions on the command line (datasets download genome accession ACC1 ACC2 ... ACC200).
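The download step described above can be sketched as a small shell function. This is only a sketch under a few assumptions: `datasets` is on your PATH, the accessions sit one per line in a file, and the file and directory names (`accessions.txt`, `Pseudomonas_subset`) are hypothetical. It uses `--inputfile` so that 200 accessions do not have to be typed on the command line.

```shell
#!/usr/bin/env bash
# Sketch: download a chosen set of genomes listed one accession per line.
# Assumptions: `datasets` is on PATH; all file and directory names are hypothetical.

download_selected() {
    local acc_file="$1" out_zip="$2"
    # --inputfile reads the accessions from a file instead of the command line;
    # the remaining flags mirror the command in the question.
    datasets download genome accession \
        --inputfile "$acc_file" \
        --dehydrated \
        --include genome,gff3 \
        --filename "$out_zip"
}

# Usage (not run here):
#   download_selected accessions.txt Pseudomonas_subset/genomes.zip
#   unzip Pseudomonas_subset/genomes.zip -d Pseudomonas_subset
#   datasets rehydrate --directory Pseudomonas_subset
```

Because the package is dehydrated, the sequence files are only fetched by the `rehydrate` step, so you can inspect the accession list before committing the disk space.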

Edit: The following will get you 200 RefSeq genome IDs that you can capture (the first 200 returned, not a random sample).

 ./datasets summary genome taxon 286 --limit 200 --reference --assembly-source RefSeq --report ids_only --as-json-lines | ./dataformat tsv genome --fields accession

Or optionally use https://www.ncbi.nlm.nih.gov/datasets/genome/?taxon=286 (use the dropdown to display 100 rows in the table, so 2 pages should get you 200 accessions) to either grab the accession numbers of genomes from the table or download the genomes from that GUI itself.
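If a truly random sample is wanted rather than whatever the first 200 results happen to be, one option is to save the full ID list and shuffle it locally. A minimal sketch, assuming GNU coreutils `shuf` is installed; `random_subset`, `all_ids.txt` and `sample_200.txt` are hypothetical names, and the function simply ignores any line that does not look like a GCF_/GCA_ accession (such as a TSV header).

```shell
#!/usr/bin/env bash
# Sketch: pick a random subset of assembly accessions from a saved ID list.
# Assumptions: GNU `shuf` is available; file names are hypothetical.

random_subset() {
    local n="$1" all_ids="$2"
    # Keep only lines that look like assembly accessions (drops headers and
    # blank lines), remove duplicates, then sample n lines without replacement.
    grep -E '^GC[AF]_[0-9]+\.[0-9]+$' "$all_ids" | sort -u | shuf -n "$n"
}

# Usage (not run here):
#   datasets summary genome taxon 286 --assembly-source RefSeq \
#       --report ids_only --as-json-lines \
#     | dataformat tsv genome --fields accession > all_ids.txt
#   random_subset 200 all_ids.txt > sample_200.txt
#   datasets download genome accession --inputfile sample_200.txt \
#       --dehydrated --include genome,gff3 \
#       --filename Pseudomonas_subset/genomes.zip
```

The ID-only summary is small compared with the genomes themselves, so listing every accession first and sampling afterwards should stay cheap in both time and storage.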

Maybe make this an answer rather than a comment?

I wanted to make it a complete answer with an additional command. Moved now.

Thanks! Works great.
