3.9 years ago

Hi, I want to bulk download all the fastq files in a study from ENA: http://www.ebi.ac.uk/ena/data/view/PRJEB3334

It is possible to click through to each experiment and download the files from there, but this would be very slow. Is there any way to bulk download?

3.9 years ago
GenoMax 104k

You can click on the "Bulk download files" button on the page you linked above. Alternatively, you can click on the "TEXT" button and open the file that gets downloaded. Links for all fastq files from the dataset are in this file.

Thanks. Sorry, what I actually meant to ask was how to get those text files for all the studies under this link: http://www.ebi.ac.uk/ena/data/view/PRJNA235852

I know how to go from there using wget with the text file.

Unless I am misinterpreting the statement above, the second part of my answer has that info. The 10th column in the file downloaded via the TEXT link has the ftp links that you can use with wget. You can fetch them with a simple bash for loop or, if you are using a cluster, via multiple jobs.
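For what it's worth, here is a rough sketch of that loop. It assumes the TEXT report is tab-separated with a header row, that the links sit in column 10 as described above, and that paired-end runs list several links in one field separated by ";". The function names `ena_links` and `ena_download` are made up for this example, and the `sed` step only matters if the links come without the `ftp://` prefix.

```shell
# Print the links found in one column of an ENA "TEXT" report, one per line.
# $1 = report file, $2 = column number (assumed to be 10 if not given).
ena_links() {
    awk -F '\t' -v col="${2:-10}" 'NR > 1 && $col != "" { print $col }' "$1" |
        tr ';' '\n' |
        sed -e '/^$/d' -e 's#^ftp\.#ftp://ftp.#'
}

# Download every link with wget, one file at a time.
# -c lets an interrupted download resume instead of starting over.
ena_download() {
    ena_links "$@" | while IFS= read -r url; do
        wget -c "$url"
    done
}
```

Something like `ena_download report.txt` (with `report.txt` standing in for whatever the downloaded file is called) would then fetch everything. Running `ena_links report.txt` on its own first is a cheap way to check that the column number is right before starting a large download.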

It seems as though your project is actually a "Parent project" with many components. So while you can get the text files for them individually, you cannot get them all at once. I don't have an exact fix, but while experimenting with wget to familiarize myself with FTP I came up with a makeshift answer that may work. Honestly, though, it may be easier to just download each one individually, as this approach requires some parsing on top of the download.

In short, I looked at two URLs from different projects and found the point where they stop having anything in common. From that you can get an FTP directory and get creative with wget. Based on that, this may work:

wget -r -l2 -A.txt ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR116/

(-r is recursive, -l2 tells it to go at most 2 folders deep, and -A tells it which file suffix to accept.) If you ultimately just want all the fastq.gz files, I think you could also do this:

wget -r -l2 -A .gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR116/

I'll repeat that I mostly found this while looking for solutions myself, and there is almost certainly a quicker way, but this seemed to work.
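If it helps, the URL comparison I described (finding where two links stop matching) could be sketched roughly like this. The function name `common_dir` is just something made up for the example. It works on plain strings and does not touch the network.

```shell
# Print the deepest directory shared by two URLs.
# $1 and $2 are full links to files, e.g. two fastq.gz links from the reports.
common_dir() {
    dir="${1%/*}"                     # drop the file name from the first URL
    while [ -n "$dir" ]; do
        case "$2" in
            "$dir"/*)                 # second URL lives under this directory
                printf '%s/\n' "$dir"
                return 0 ;;
        esac
        case "$dir" in
            */*) dir="${dir%/*}" ;;   # go one level up and try again
            *)   break ;;
        esac
    done
    return 1
}
```

Given two links that sit under ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR116/ in different run folders, this should print that directory, which is what I then passed to `wget -r` above.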