Question: How to download all SRA samples at once?
5.9 years ago by
biorepine1.4k wrote:

Dear Biostars,

As you may know, SRA is a repository for all types of sequencing data. I often have to download data manually, copying the link of every SRA dataset by hand and fetching it with wget. Is there a simpler approach than copying links manually? Thanks in advance.

For example: how can I download all the data related to SRP026197?

geo sra download • 31k views
ADD COMMENTlink modified 19 months ago by Federico Giorgi580 • written 5.9 years ago by biorepine1.4k

Have you tried the SRAdb package from bioconductor? It's been a while, but I think it can be used to do that sort of thing.

ADD REPLYlink written 5.9 years ago by Devon Ryan93k

Actually, SRA is the repository for sequence data, not GEO. There are links between the two databases, but your question is actually related to SRA.

ADD REPLYlink written 5.9 years ago by Sean Davis25k

Oh yeah, you are right. I will edit my question. Thanks.

ADD REPLYlink written 5.9 years ago by biorepine1.4k

Here is another solution: "How to download raw sequence data from GEO/SRA"

ADD REPLYlink written 5.4 years ago by Istvan Albert ♦♦ 82k
5.9 years ago by
Sean Davis25k
National Institutes of Health, Bethesda, MD
Sean Davis25k wrote:

In R:

library(SRAdb)
srafile = getSRAdbFile()   # downloads the SRA metadata SQLite file (a large download)
con = dbConnect('SQLite',srafile)

Now we are ready to query the local SQLite database:

listSRAfile('SRP026197', con)

Results in:

        study    sample experiment       run ftp
1   SRP026197 SRS449410  SRX311638 SRR913951
2   SRP026197 SRS449476  SRX311704 SRR914066
3   SRP026197 SRS449408  SRX311636 SRR913949
247 SRP026197 SRS449508  SRX311735 SRR914158
248 SRP026197 SRS449460  SRX311688 SRR914006
249 SRP026197 SRS449509  SRX311736 SRR914160

If you simply want to have R do the downloads for you, that is also straightforward:

getSRAfile('SRP026197', con, fileType = 'sra')

If you have access to the aspera client command line utility, ascp, you can have R use it instead of ftp, resulting in much greater download speeds. See the help for getSRAfile for details.

ADD COMMENTlink written 5.9 years ago by Sean Davis25k

In my case, the solution above worked with some modifications - I had to install and load the DBI package first and then change the dbConnect line:

srafile = getSRAdbFile()
con = dbConnect(RSQLite::SQLite(), srafile)
listSRAfile('SRP026197', con)

Without these modifications I got the message "Error: unable to find an inherited method for function 'dbConnect' for signature '"character"'".

ADD REPLYlink written 4.6 years ago by adumitri70

Hi. I used this code but ran into a problem:

srafile = getSRAdbFile()
con = dbConnect(RSQLite::SQLite(), srafile)
listSRAfile('SRP026197', con)

After downloading, I get this error:

Error in result_create(conn@ptr, statement) : database disk image is malformed

What should I do?

ADD REPLYlink modified 18 months ago • written 18 months ago by samane.0

Hi, it is working great! However, I couldn't find a way to retrieve the information (e.g., which tissue an RNA-Seq sample comes from) related to a specific SRA accession. Samples are usually labeled with GSE IDs rather than SRA IDs. Any suggestions would be appreciated!

ADD REPLYlink written 5.3 years ago by biorepine1.4k

You can use GEOmetadb to access NCBI GEO information in a similar way as for SRA data and SRAdb.

ADD REPLYlink written 5.3 years ago by Sean Davis25k

Yes, but I have already downloaded and processed a large number of SRA samples. All I want to do is rename them with the proper GEO ID. I didn't see any information on this in either of the packages :(

ADD REPLYlink written 5.2 years ago by biorepine1.4k

This comes a bit late, but you might want to try something like this:

library(GEOquery)
gse <- getGEO('GSE48138')  # retrieves a list of sample sets for the GEO series
## see what is in there:
gse
# There are 2 sets of samples for that ID
## what you want is a table with the SRR to download and some sample information:
## let's see what the first set contains:
df <- Biobase::pData(gse[[1]])  # sample annotation table of the first set

The table above contains loads of information regarding the samples/files, IDs, etc. You will have to see what interests you and use it to rename the files. I hope it helps.
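Once the accessions of interest have been written to a two-column, tab-separated file, the renaming itself can be scripted from the shell. The following is only a minimal sketch: the function name rename_by_map, the layout of the mapping file (run accession first, GEO sample ID second) and the GEOID_RUN.sra naming scheme are all assumptions made for illustration, not something provided by GEOquery or SRAdb.

```shell
# rename_by_map MAP_FILE DIR   (hypothetical helper, not part of any package)
# MAP_FILE: tab-separated; column 1 = run accession, column 2 = GEO sample ID.
# Renames DIR/<run>.sra to DIR/<geo>_<run>.sra and reports runs with no file.
rename_by_map() {
    local map_file=$1 dir=$2
    local tab run geo rest src
    tab=$(printf '\t')
    while IFS="$tab" read -r run geo rest || [ -n "$run" ]; do
        # skip blank or incomplete lines
        if [ -z "$run" ] || [ -z "$geo" ]; then
            continue
        fi
        src="$dir/$run.sra"
        if [ -e "$src" ]; then
            mv -- "$src" "$dir/${geo}_${run}.sra"
        else
            echo "missing: $src" >&2
        fi
    done < "$map_file"
}

# Usage (file and directory names are placeholders):
#   rename_by_map srr_to_gsm.tsv /path/to/sra_files
```

Keeping the run accession in the new name means the original file can always be traced back, even if two runs belong to the same GEO sample.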

ADD REPLYlink written 4.9 years ago by A. Domingues2.2k

Hello there!

I am trying to extract the following SRA accession numbers with Bioconductor v3.1:



However, by running

getSRAfile(in_acc = c("SRP041432","ERP010058","SRP032486","SRP048789","SRP016517","ERP010240","SRP042345","SRP050383","SRP039499","SRP024388","SRP039009","SRP040131","SRP010723","ERP010570","SRP045342","ERP002340","ERP003677","SRP040950"),
           sra_con = sra_con, destDir = getwd(), fileType = 'sra', srcType = 'ftp')


I get error messages due to specific files, which I later confirm are available for download in SRAdownload, for example…

The error message:

trying URL ''

Error in download.file(i, destfile = file.path(destDir, basename(i)),  :

  cannot open URL ''


Am I doing anything wrong?

ADD REPLYlink written 4.4 years ago by massacomgrao0

srafile = getSRAdbFile()
trying URL ''
Error in download.file(url_sra, destfile = localfile, mode = "wb", method = method) :
  cannot open URL ''

ADD REPLYlink written 3.1 years ago by kevinchjp10

How to do this in R for controlled access data hosted at dbGaP if we have the key file rather than using prefetch/fastq-dump?

ADD REPLYlink written 2.4 years ago by Ömer An190
19 months ago by
Columbia University
Federico Giorgi580 wrote:

A non-R solution is to use the SRA toolkit prefetch command on a list of SRA identifiers.

First you need the accession list, which you can download in batch. In your case, open the study's results page and, at the top right, click "Send to", then "File", then "Accession List".

Once you have it saved in a file (default is SraAccList.txt) you can use the command (tested in SRA toolkit 2.9.0):

prefetch $(<SraAccList.txt)

The .sra files will be downloaded to the default SRA folder. You can change the download location with this trick:

echo '/repository/user/main/public/root = "/path/to/download"' > $HOME/.ncbi/user-settings.mkfg
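For long accession lists, it can be more forgiving to call the tool once per accession, so that one failed download does not stop the rest. Here is a minimal sketch of that idea; the function name fetch_accessions and the FETCH_CMD variable are made up for this example (they are not part of the SRA toolkit), and the only toolkit behaviour assumed is that prefetch accepts a single accession as its argument.

```shell
# fetch_accessions LIST_FILE   (hypothetical helper)
# Runs one download command per accession in LIST_FILE (one accession per line).
# FETCH_CMD defaults to prefetch; set FETCH_CMD=echo for a dry run.
fetch_accessions() {
    local list_file=$1
    local cmd=${FETCH_CMD:-prefetch}
    local failed=0 acc cr
    cr=$(printf '\r')
    while IFS= read -r acc || [ -n "$acc" ]; do
        acc=${acc%"$cr"}                 # tolerate Windows line endings
        [ -z "$acc" ] && continue        # skip blank lines
        # stdin is redirected so the command cannot swallow the rest of the list
        if ! "$cmd" "$acc" < /dev/null; then
            echo "failed: $acc" >&2
            failed=$((failed + 1))
        fi
    done < "$list_file"
    return "$(( failed > 0 ))"           # non-zero if any accession failed
}

# Usage:
#   fetch_accessions SraAccList.txt
#   FETCH_CMD=echo fetch_accessions SraAccList.txt   # only print the accessions
```

Failed accessions are printed to standard error, so they can be collected into a new list and retried later.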
ADD COMMENTlink written 19 months ago by Federico Giorgi580

This is brilliant! It also works for fastq-dump:

fastq-dump --split-3 --gzip $(</path_to/SRR_Acc_List.txt)
ADD REPLYlink modified 10 months ago • written 19 months ago by ThePresident140
23 months ago by
vr10 wrote:

If you have a GSE accession, you can give this a try:

The most important precondition is proper configuration of where you'd like the raw .sra files to be downloaded. You can also set some environment variables (mentioned in the tool's command-line help) that facilitate straightforward use. It can be as simple as something like:

/path/to/ -i [GSE accession]

ADD COMMENTlink modified 23 months ago • written 23 months ago by vr10
3.8 years ago by
Ada0 wrote:

When I run the code on my computer, I get the problem below. What is wrong?

trying URL ''
Content type 'application/x-gzip' length 1308358823 bytes (1247.7 Mb)
opened URL
downloaded 1247.7 Mb

Error in .local(drv, ...) : Could not connect to database: unable to open database file

ADD COMMENTlink modified 3.8 years ago • written 3.8 years ago by Ada0

Perhaps you ran out of space in /tmp or the equivalent. Anyway, please post things like this as new questions.

ADD REPLYlink modified 3.8 years ago • written 3.8 years ago by Devon Ryan93k