Question

Help downloading original data from SRA

4.0 years ago

stefan.grabuschnig ▴ 10

Hi Biostars community! I am trying to download the original paired-end fastq files for three biosamples of a BioProject from SRA. For example: https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR8581464

There are 4 original fastq files. Trying to download them with the SRA Toolkit command "prefetch --type fastq SRR8581464" results in the following error:

"prefetch.2.10.5 err: error unexpected while resolving query within virtual file system module - failed to resolve accession 'SRR8581464' - The object is not available from your location. ( 406 )"

Without the --type switch it downloads the sra file, which does not help me because the original files contain reads that I want to analyse separately (these are reads from DNA-containing extracellular vesicles of different densities).

I think the original files can only be downloaded from the Amazon/Google cloud, but I don't understand how to do so. The NCBI documentation only describes how to create a cloud VM and install their toolkit there, which I don't want to do. I just want those files on my machine… I am astonished how they managed to make something as simple as downloading files that complicated.

Thank you very much in advance for your help. Cheers! Stefan
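For anyone who does want to pull an original file straight from the cloud bucket onto a local machine, here is a minimal Python sketch. It is an illustration under assumptions, not a confirmed recipe: it uses the third-party boto3 library, takes the s3:// path that appears in the SRA listing quoted later in this thread, and assumes AWS credentials are already configured locally (the listing marks access as "aws identity", which I read as requiring an AWS account). Whether the bucket needs extra settings such as requester-pays is not stated in the thread. The helper names `split_s3_uri` and `download_s3_object` are made up for this example.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def download_s3_object(uri, dest_path):
    """Download one object to dest_path.

    Assumes AWS credentials are already configured (environment
    variables, ~/.aws/credentials, ...). Hypothetical helper.
    """
    # boto3 is third-party; importing it here keeps split_s3_uri usable without it.
    import boto3

    bucket, key = split_s3_uri(uri)
    boto3.client("s3").download_file(bucket, key, dest_path)


if __name__ == "__main__":
    uri = "s3://sra-pub-src-3/SRR8581464/P5514_102_S23_L003_R1_001.fastq"
    print(split_s3_uri(uri))
    # Uncomment to actually download (the file is several GB):
    # download_s3_object(uri, "P5514_102_S23_L003_R1_001.fastq")
```

The download call is left commented out because the files are large and cloud egress may be billed to the account that owns the credentials.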

sequence SRA SRA-Toolkit software error • 6.6k views

ADD COMMENT • link 4.0 years ago by stefan.grabuschnig ▴ 10

Thanks! This already helped a lot. Still, there should be 4 files for this experiment (L003 R1 and R2 plus L007 R1 and R2). Maybe I can split them somehow by read name. Cheers! Stefan

fastq 8,755,578 Kb
AWS s3://sra-pub-src-3/SRR8581464/P5514_102_S23_L003_R1_001.fastq s3.us-east-1 aws identity
GCP gs://sra-pub-src-3/SRR8581464/P5514_102_S23_L003_R1_001.fastq gs.US gcp identity

fastq 8,755,578 Kb
AWS s3://sra-pub-src-3/SRR8581464/P5514_102_S23_L003_R2_001.fastq s3.us-east-1 aws identity
GCP gs://sra-pub-src-3/SRR8581464/P5514_102_S23_L003_R2_001.fastq gs.US gcp identity

fastq 44,537,070 Kb
AWS s3://sra-pub-src-3/SRR8581464/P5514_102_S77_L007_R1_001.fastq s3.us-east-1 aws identity
GCP gs://sra-pub-src-3/SRR8581464/P5514_102_S77_L007_R1_001.fastq gs.US gcp identity

fastq 44,537,070 Kb
AWS s3://sra-pub-src-3/SRR8581464/P5514_102_S77_L007_R2_001.fastq s3.us-east-1 aws identity
GCP gs://sra-pub-src-3/SRR8581464/P5514_102_S77_L007_R2_001.fastq gs.US gcp identity
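On the idea of splitting by read name: a rough Python sketch of how that could look, assuming the reads still carry Illumina-style names of the form instrument:run:flowcell:lane:tile:x:y, where the fourth colon-separated field is the lane. That is an assumption; SRA-processed fastq files often have their read names replaced by run-based identifiers, in which case this will not work. The function names `lane_of` and `split_by_lane` and the reads in the demo are invented for illustration.

```python
import io


def lane_of(header):
    """Return the lane field from an Illumina-style read name.

    Looks for the first whitespace-separated token with at least seven
    colon-separated fields and returns its fourth field.
    """
    for token in header.lstrip("@").split():
        fields = token.split(":")
        if len(fields) >= 7:
            return fields[3]
    raise ValueError(f"no Illumina-style read name in header: {header!r}")


def split_by_lane(fastq_handle, open_output):
    """Stream 4-line FASTQ records into one output per lane.

    open_output(lane) must return a writable text handle; the caller is
    responsible for closing the handles. Returns {lane: record_count}.
    """
    outputs, counts = {}, {}
    while True:
        record = [fastq_handle.readline() for _ in range(4)]
        if not record[0]:
            break  # end of file
        if not record[3]:
            raise ValueError("truncated FASTQ record at end of input")
        lane = lane_of(record[0].rstrip("\n"))
        if lane not in outputs:
            outputs[lane] = open_output(lane)
            counts[lane] = 0
        outputs[lane].writelines(record)
        counts[lane] += 1
    return counts


if __name__ == "__main__":
    # Two made-up reads, one from lane 3 and one from lane 7.
    demo = io.StringIO(
        "@INST:1:FLOWCELL:3:1101:1000:2000 1:N:0:ACGT\nACGT\n+\nIIII\n"
        "@INST:1:FLOWCELL:7:1101:1000:2000 1:N:0:ACGT\nTTTT\n+\nIIII\n"
    )
    handles = {}

    def opener(lane):
        handles[lane] = io.StringIO()
        return handles[lane]

    print(split_by_lane(demo, opener))
```

For real files, `open_output` could be something like `lambda lane: open(f"L{int(lane):03d}_R1.fastq", "w")`, run once for R1 and once for R2 so the pairs stay in sync.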

ADD REPLY • link 4.0 years ago by stefan.grabuschnig ▴ 10

If these are technical replicates (on two lanes) then you don't need to split the file into separate lanes.

ADD REPLY • link 4.0 years ago by GenoMax 141k

Hi! I am not entirely sure. In their paper they say that they produced several fractions (F1 to F7) of vesicles based on their density, where DNA cargo varied between high and low density fractions. I assume these L003 and L007 files are from the respective fractions. I need to check if this is documented somewhere. Thank you very much for your help! Cheers! Stefan

ADD REPLY • link 4.0 years ago by stefan.grabuschnig ▴ 10

score 1 · Answer 1 · 2020-04-29

4.0 years ago

GenoMax 141k

Get the fastq files for SRR8581464 from EBI-ENA. Click on "Show" in "Read Files" in the top right column.
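If you would rather script this than click through the ENA browser, something along these lines may work. It is a sketch under assumptions: it queries the ENA Portal API "filereport" endpoint, and the endpoint URL and the field names `fastq_ftp` and `submitted_ftp` come from my recollection of that API rather than from this thread, so check them against the ENA documentation. Whether the submitter's original per-lane files are listed for this particular run is also not something I can confirm. The function names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the ENA Portal API file report.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"
DEFAULT_FIELDS = ("fastq_ftp", "submitted_ftp")


def filereport_url(run_accession, fields=DEFAULT_FIELDS):
    """Build the file report URL for one run accession."""
    query = urlencode(
        {
            "accession": run_accession,
            "result": "read_run",
            "fields": ",".join(fields),
            "format": "tsv",
        },
        safe=",",
    )
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text, fields=DEFAULT_FIELDS):
    """Return {field: [url, ...]} from the tab-separated report text.

    Each field holds a semicolon-separated list of paths; paths without
    a scheme get 'ftp://' prepended.
    """
    links = {field: [] for field in fields}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        for field in fields:
            for path in (row.get(field) or "").split(";"):
                if path:
                    links[field].append(path if "://" in path else "ftp://" + path)
    return links


def fetch_links(run_accession):
    """Query ENA over the network and return the parsed download links."""
    with urlopen(filereport_url(run_accession)) as response:
        return parse_filereport(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(filereport_url("SRR8581464"))
    # Uncomment to query ENA (needs network access):
    # print(fetch_links("SRR8581464"))
```

The returned links can then be passed to wget, curl, or any FTP client.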

ADD COMMENT • link 4.0 years ago by GenoMax 141k