I want to download a list of around 2000 RNA-sequence files. My current approach is very inefficient:
- I go to https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=search_obj
- Type in the name of the experiment, e.g. SRR1186053.
- Then I click my way through to https://trace.ncbi.nlm.nih.gov/Traces/sra/view=search_seq_name&exp=SRX483399&run=&m=search&s=seq, where I click download on the FASTA file (I don't know whether I need FASTQ instead).
- After the file is downloaded, I strip out the metadata (time, length, etc.) and concatenate all the smaller sequences into one big sequence. This is done in a simple C++ program I have written.
- Finally, I extract all k-mers from this sequence and add them to a data structure I have built.
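To make the last two steps concrete, here is a minimal sketch of the idea. It is written in Python for brevity and is not the actual C++ program. The function names `concatenate_fasta` and `count_kmers`, the made-up FASTA headers, the choice of `k=3`, and the use of a `Counter` as the "data structure" are all assumptions for illustration only.

```python
from collections import Counter
from typing import Iterable


def concatenate_fasta(lines: Iterable[str]) -> str:
    """Drop FASTA header lines (starting with '>') and join the rest."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and per-read metadata headers
        parts.append(line.upper())
    return "".join(parts)


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every length-k substring (k-mer) of the sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Tiny made-up input standing in for one downloaded FASTA file.
fasta_lines = [
    ">read1 made-up header",
    "ACGT",
    ">read2 made-up header",
    "GTAC",
]
sequence = concatenate_fasta(fasta_lines)
kmer_counts = count_kmers(sequence, k=3)
```

Note that concatenating the reads like this also produces k-mers that span the boundary between two reads.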
I would love to know if there is an easier way to do this. How can I download all of these files in bulk?
Example of a list:
- blood SRR1186053 8399571400 5850
- blood SRR1186053 8399571400 5850
- blood SRR805782 724114625 412
- blood SRR837459 1313374050 841 ...
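As a sketch of how such a list could be turned into input for any bulk-download step, the snippet below pulls out the unique accessions. It assumes the list is plain text with whitespace-separated columns and the accession in the second column; the meaning of the last two columns is not used. The function name `parse_accessions` is hypothetical.

```python
from typing import Iterable, List


def parse_accessions(lines: Iterable[str]) -> List[str]:
    """Return the unique accessions (second column) in first-seen order."""
    seen = set()
    accessions = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        accession = fields[1]
        if accession not in seen:
            seen.add(accession)
            accessions.append(accession)
    return accessions


# The example rows from the list above (assumed column order).
example_rows = [
    "blood SRR1186053 8399571400 5850",
    "blood SRR1186053 8399571400 5850",
    "blood SRR805782 724114625 412",
    "blood SRR837459 1313374050 841",
]
unique_accessions = parse_accessions(example_rows)
```

Deduplicating first avoids fetching the same run twice when an accession is repeated in the list, as `SRR1186053` is here.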