Question

The fastest way to download a list of SRR accessions from Sequence Read Archive with sratoolkit

0

4.3 years ago

Denis ▴ 310

I've installed sratoolkit.2.10.0 in my home directory on a cluster. I have to download numerous SRR accessions to my home directory. As far as I understand, there are several options:

fastq-dump
run the prefetch utility, then convert the resulting .sra files to FASTQ with fastq-dump
fasterq-dump (supports multi-threading but, if I'm correct, cannot take a list of SRR accessions as input)

Which option is the fastest? Could you please provide a command line example which will be suitable for my purposes?

I found a very useful post here: download from SRA. However, it seems too old.

sequence genome • 8.7k views

ADD COMMENT • link updated 3.1 years ago by kostaspildish ▴ 20 • written 4.3 years ago by Denis ▴ 310

score 2 · Answer 1 · 2020-01-13

2

4.3 years ago

GenoMax 141k

Your best bet is to use this: Fast download of FASTQ files from the European Nucleotide Archive (ENA) instead.

ADD COMMENT • link 4.3 years ago by GenoMax 141k

2

This link covers what are, in my opinion, the two fastest options. The first is to download directly in FASTQ format from ENA, and the second is prefetch followed by parallel-fastq-dump. See the thread for details, including code examples. Don't use any of the "dump" commands to download data directly; in my experience they are too slow and too unstable.

ADD REPLY • link 4.3 years ago by ATpoint 82k
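
For readers who want a starting point, the second option from the reply above (prefetch followed by parallel-fastq-dump) might look roughly like the sketch below. It is not taken from the linked thread. The function name `fetch_and_convert`, the `RUN` dry-run switch, and the default output directory and thread count are assumptions made for illustration. It also assumes the conversion runs from the same working directory as prefetch, so the downloaded runs can be resolved by accession.

```shell
# Sketch: prefetch every accession in a list, then convert each one with
# parallel-fastq-dump. Set RUN=echo for a dry run that only prints the commands.
fetch_and_convert() {
    list="$1"              # text file, one accession per line
    outdir="${2:-fastq}"   # assumed output directory
    threads="${3:-4}"      # assumed thread count

    # prefetch reads the whole accession list via --option-file
    ${RUN:-} prefetch --option-file "$list" || return 1

    while IFS= read -r acc || [ -n "$acc" ]; do
        [ -z "$acc" ] && continue   # skip blank lines
        # stdin is redirected so the tool cannot consume the rest of the list
        ${RUN:-} parallel-fastq-dump --sra-id "$acc" --threads "$threads" \
            --outdir "$outdir" --split-files --gzip < /dev/null
    done < "$list"
}
```

Calling `RUN=echo fetch_and_convert SraAccList.txt fastq 8` prints the commands without running them, which is a cheap way to check the list before starting a long download.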

score 2 · Answer 2 · 2021-03-17

2

3.1 years ago

kostaspildish ▴ 20

Assuming you are using bash, you can feed an accession list (or any text file with one accession per line) to fasterq-dump as follows: cat SraAccList.txt | xargs fasterq-dump. The command also accepts additional parameters, e.g. --outdir.

ADD COMMENT • link 3.1 years ago by kostaspildish ▴ 20
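
A slightly more defensive variant of the xargs approach above is sketched here. It assumes you would rather start one fasterq-dump process per accession (`xargs -n 1`) than pass the whole list to a single call, which may matter if your toolkit version accepts only one accession per invocation. The function name `dump_accessions`, the `RUN` dry-run switch, and the default values are hypothetical.

```shell
# Sketch: run fasterq-dump once per accession instead of passing the whole list
# to a single call. Set RUN=echo for a dry run that only prints the commands.
dump_accessions() {
    list="$1"              # text file, one accession per line
    outdir="${2:-fastq}"   # assumed output directory
    threads="${3:-4}"      # assumed thread count
    # -n 1 makes xargs start one fasterq-dump process per accession
    xargs -n 1 ${RUN:-} fasterq-dump --outdir "$outdir" --threads "$threads" < "$list"
}
```

For example, `dump_accessions SraAccList.txt fastq 8` would write the FASTQ files to `fastq/` using 8 threads per run.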