Question: Downloading individual reads from SRA
2.0 years ago by
mnmalash0
mnmalash0 wrote:

How can I download selected individual reads from SRA from the command line, without downloading the whole run file?

modified 2.0 years ago by vkkodali2.4k • written 2.0 years ago by mnmalash0
2.0 years ago by
vkkodali2.4k
United States
vkkodali2.4k wrote:

You can probably use fastq-dump to download SRR1803613.479767.1 as follows, though I haven't tried this with paired reads, etc.

fastq-dump -A SRR1803613 -N 479767 -X 479767 --fasta
written 2.0 years ago by vkkodali2.4k

Perfect, this is what I want. I will try it on both single-end and paired-end reads and report back.

written 2.0 years ago by mnmalash0

It worked well with both single-end and paired-end runs. However, for paired-end runs it downloads both mates, not either one alone. The '-N' and '-X' options accept only numerals, not the dot suffix (.1 or .2). Do you have a way to download just one of them when wanted?

written 2.0 years ago by mnmalash0
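
One possible way to keep only one mate of a pair is to have fastq-dump label each mate and then filter the stream. The sketch below assumes that --split-spot emits each mate as its own record and that --readids appends .1 or .2 to the read name on the defline; pick_mate is a hypothetical helper, not part of the SRA Toolkit, and the exact defline layout may differ between toolkit versions.

```shell
# pick_mate MATE: read FASTA on stdin and keep only the records whose name
# (the first word of the defline) ends in ".MATE". Hypothetical helper.
pick_mate() {
    awk -v suffix=".$1" '
        # On each defline, decide whether the record that follows is kept.
        /^>/ { keep = (substr($1, length($1) - length(suffix) + 1) == suffix) }
        # Print the defline and its sequence lines only while keep is true.
        keep
    '
}

# Assumed usage (needs the SRA Toolkit and network access):
#   fastq-dump -N 479767 -X 479767 --split-spot --readids --fasta 0 -Z SRR1803613 | pick_mate 2
```

The filter only looks at the defline, so it still transfers both mates of the spot and discards one locally.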

I got an error message due to the missing parameter for the fasta option. I give a corrected example here: C: How to retrieve an individual read with fastq-dump

ADD REPLYlink written 7 weeks ago by wmorgan4850
2.0 years ago by
h.mon32k
Brazil
h.mon32k wrote:

What do you mean by "download selected individual reads"? How do you know which reads you want to download?

fastq-dump allows streaming the output, so you can combine it with head to download the first however many reads you want. However, there is no random access to these files, so you can't access "individual" reads, and there is no way of skipping the download of a certain number of reads: you can combine head and tail, for example, but you still have to download the reads before discarding them.

written 2.0 years ago by h.mon32k
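
The head-and-tail approach described in this answer can be sketched as a small stream filter. It assumes fastq-dump is run with -Z (write to standard output) and that every FASTQ record is exactly four lines; fastq_slice is a hypothetical helper name. Records before the requested range are still read from the stream and thrown away.

```shell
# fastq_slice FIRST LAST: print FASTQ records FIRST..LAST (1-based, inclusive)
# from stdin, assuming four lines per record, and stop reading after LAST.
fastq_slice() {
    awk -v first="$1" -v last="$2" '
        NR > 4 * last { exit }      # past the last wanted record: stop
        NR > 4 * (first - 1)        # inside the wanted range: print the line
    '
}

# Assumed usage (needs the SRA Toolkit and network access):
#   fastq-dump -Z SRR123456 | fastq_slice 1 1000
```

Once awk exits, the pipe closes, which should normally make the upstream fastq-dump stop instead of streaming the rest of the run.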

I mean downloading individual reads whose IDs I know, for example SRR123456.4564, without downloading the whole SRR123456 run file, which may be large and take a long time. I want to do this for a list of reads from many SRA runs.

written 2.0 years ago by mnmalash0
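
For a list of read IDs spread over many runs, one option is to split each ID into its run accession and spot number and request each spot on its own with the -N and -X options mentioned in this thread. This is a minimal sketch, not a tested pipeline: fetch_reads and the DRY_RUN switch are hypothetical names, and passing 0 as the line width to --fasta (no wrapping) is an assumption about the installed toolkit version.

```shell
# fetch_reads: read IDs such as SRR123456.4564 (one per line) on stdin and
# request each spot with fastq-dump. With DRY_RUN=1 the commands are only
# printed, not executed.
fetch_reads() {
    while IFS= read -r id; do
        [ -n "$id" ] || continue      # skip blank lines
        run=${id%%.*}                 # text before the first dot
        rest=${id#*.}                 # text after the first dot
        spot=${rest%%.*}              # drop an optional mate suffix (.1 or .2)
        case $spot in
            ''|*[!0-9]*)
                echo "skipping malformed ID: $id" >&2
                continue ;;
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "fastq-dump -N $spot -X $spot --fasta 0 -Z $run"
        else
            # </dev/null keeps fastq-dump from consuming the ID list on stdin
            fastq-dump -N "$spot" -X "$spot" --fasta 0 -Z "$run" </dev/null
        fi
    done
}

# Assumed usage, with read_ids.txt as a hypothetical file of IDs:
#   DRY_RUN=1 fetch_reads < read_ids.txt
#   fetch_reads < read_ids.txt > selected_reads.fasta
```

Each ID triggers a separate fastq-dump call, so for many reads from the same run it may be worth sorting the list by run first.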

Just out of curiosity, how / why do you know which reads you want?

written 2.0 years ago by h.mon32k

I have already searched all of the sequences with the HMMs of interest. The positive reads and their IDs are listed in the report, and I want to retrieve those reads separately.

written 2.0 years ago by mnmalash0