Getting multiple *.sra files from NCBI using a list
6.8 years ago
chrys ▴ 60

Hi there folks, I am trying to download SRA files for a dataset I compiled from the NIH Roadmap Data Matrix. The problem is that the dataset spans about 50-60 files. I have little intention of downloading them by hand, and I thought a quick wget would help. For some reason, though, each link provided just points to another subdirectory rather than to the actual *.sra files, which makes them a lot harder to download:

List:

Sample1 ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX099/SRX099571
Sample2 ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX040/SRX040594

My script so far does this:

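# List holds one "name URL" pair per line
while read -r name files; do
    mkdir -p "$name"
    # -P sets the download directory; a leading slash would point at the filesystem root
    wget "$files" -P "$name/"
done < List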

I also tried several other approaches, for example:

wget --no-parent -r -l1 ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX099/SRX099571/

My list just contains my sample names and the links provided by the data matrix. But wget seems to have problems accessing the subdirectory with the *.sra file unless I specify the path explicitly.

If anybody has an idea on how to solve this, I would be eternally grateful, since by now I probably would have been done downloading them by hand.

sra NCBI • 3.2k views

Well, it would work, but I would still need to obtain the explicit identifier of every experiment, since the GEO accession does not work. The compression of *.sra is also quite high, which matters because I do not need the files all at once but iteratively. Unfortunately, the NIH Roadmap does not let me export the direct *.sra identifiers, only the subdirectories mentioned above. That means I would still have to look up the SRA IDs by hand, correct? Sorry if I overlooked something terribly obvious.

Short example: https://www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics/?view=samples&sample=CD14%20primary%20cells

Then I hit export, take the file, and create my list from it. If I am doing something stupid, please let me know. Thank you for your help.

See if getting them from EBI-ENA is less painful. You could get FASTQ files directly, avoiding sratoolkit altogether.
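
A rough sketch of how the lookup could go, assuming ENA's Portal API `filereport` endpoint accepts the SRX experiment accession and returns `run_accession` and `fastq_ftp` columns (worth checking against the ENA documentation first). `ena_report_url` is a made-up helper name, and the script only prints the curl command rather than running it.

```shell
#!/usr/bin/env bash
# Build the ENA file-report URL for one accession.
# Assumption: the endpoint accepts an SRX experiment accession.
ena_report_url() {
    printf 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=%s&result=read_run&fields=run_accession,fastq_ftp&format=tsv\n' "$1"
}

# Take the accession from the end of a Roadmap link
link='ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX099/SRX099571'
acc=${link##*/}   # strip everything up to the last slash

# Dry run: print the curl command instead of executing it
printf "curl -s '%s'\n" "$(ena_report_url "$acc")"
```

The last path component of each exported link is the experiment accession, so the same list from the Roadmap export can be reused without looking anything up by hand.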

If the SRX accessions are in order, you could print all the wget commands in a loop and run them in the terminal.
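
A minimal sketch of such a loop, reading the two-column list from the question (sample name, then experiment directory link) rather than generating accessions. It assumes each experiment directory holds one level of run subdirectories containing the *.sra files, so the recursion depth of 2 is a guess to adjust. `print_wget_cmds` is a made-up function name, and nothing is downloaded: the commands are only printed.

```shell
#!/usr/bin/env bash
# Print one wget command per "name URL" line read from stdin (dry run).
# -r -l2 : recurse two levels (experiment dir -> run dir -> file); depth is an assumption
# -np    : never ascend to the parent directory
# -nd    : do not recreate the remote directory tree locally
# -A     : keep only files matching *.sra
# -P     : save into a directory named after the sample
print_wget_cmds() {
    local name url
    # the "|| [ -n ... ]" part handles a last line without a trailing newline
    while read -r name url || [ -n "$name" ]; do
        # skip blank or incomplete lines
        [ -n "$name" ] && [ -n "$url" ] || continue
        printf "wget -r -l2 -np -nd -A '*.sra' -P '%s' '%s/'\n" "$name" "${url%/}"
    done
}

# Example with one entry from the question's list
print_wget_cmds <<'EOF'
Sample1 ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX099/SRX099571
EOF
```

Once the printed commands look right, something like `print_wget_cmds < List | bash` would run them.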
