Question

Confusion in SRA submission


6.9 years ago

arunprasanna83 ▴ 60

Hello,

I am downloading SRA data for a study. The study is Illumina paired-end data. However, the run description says "This run has 2 reads per spot", and the Reads section shows the following:

Reads (separated)

>gnl|SRA|SRR768721.1.1 HWI-ST0798_0099:2:1101:1545:2240 (Biological)
CCAGAATGCGCCCGGTGCATTCTGGGACTCCGAATCAGAAGAGGGAGTTGCGTCAGAGGC
GGAGGTGGATGAAGCAGCGGGAGGAGAGGCGGAATCATCGG

>gnl|SRA|SRR768721.1.2 HWI-ST0798_0099:2:1101:1545:2240 (Biological)
TACACTCGTAACCTCCTCGCCGCCAACCCCGACGTTCTTCAAGAGGGTGGTGCCATTGAC
CTAAGCTCAATGTCCAGCNCNNNNNNNNNNNNNNNNNNNNN

Does this mean that .1 and .2 represent the two mates of a pair, and that I should use 'fastq-dump --split-files' to split them into two files?

Thanks in advance,

AP

RNA-Seq SRA • 2.1k views


score 2 · Answer 1 · 2017-05-17


6.9 years ago

GenoMax 141k

That is correct. I suggest you make your task easier and download the FASTQ files directly from ENA here. No sratoolkit needed.
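To make the naming explicit, here is a minimal Python sketch (not part of the original answer) that parses the two headers from the question and groups them by spot. The function name `group_reads_by_spot` and the regular expression are illustrative assumptions based only on the header format shown above, read as `run.spot.read`.

```python
import re
from collections import defaultdict

# Matches SRA FASTA-style headers such as ">gnl|SRA|SRR768721.1.1 ..."
# Assumed layout: run accession, then spot number, then read number in the spot.
HEADER_RE = re.compile(
    r"^>gnl\|SRA\|(?P<run>[A-Z]RR\d+)\.(?P<spot>\d+)\.(?P<read>\d+)\s"
)

def group_reads_by_spot(headers):
    """Map (run, spot) -> sorted list of read numbers seen for that spot."""
    spots = defaultdict(list)
    for header in headers:
        match = HEADER_RE.match(header)
        if match is None:
            continue  # not an SRA-style header, skip it
        key = (match["run"], int(match["spot"]))
        spots[key].append(int(match["read"]))
    return {key: sorted(reads) for key, reads in spots.items()}

headers = [
    ">gnl|SRA|SRR768721.1.1 HWI-ST0798_0099:2:1101:1545:2240 (Biological)",
    ">gnl|SRA|SRR768721.1.2 HWI-ST0798_0099:2:1101:1545:2240 (Biological)",
]
print(group_reads_by_spot(headers))  # {('SRR768721', 1): [1, 2]}
```

Both headers fall under spot 1 of the same run and carry the same instrument read name, which is what you expect for the two mates of one pair. If you do use sratoolkit, `fastq-dump --split-files SRR768721` writes the mates to `SRR768721_1.fastq` and `SRR768721_2.fastq`.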


Wow, that's cool! I didn't know about this!

Reply • 6.9 years ago by arunprasanna83 ▴ 60

Always check ENA to see if FASTQ files are available. They generally are, except for recent SRA submissions (which eventually appear).
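One way to script that check is sketched below in Python. It assumes the ENA Portal API `filereport` endpoint and its `fastq_ftp` field (semicolon-separated links, empty when no FASTQ has been generated yet) behave as described in ENA's documentation. The helper names `filereport_url`, `parse_fastq_links`, and `fetch_fastq_links` are hypothetical, so treat this as a starting point rather than a definitive client.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: ENA Portal API file report (check ENA's docs before relying on it).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"

def filereport_url(accession):
    """Build the file-report URL asking only for the FASTQ FTP links of a run."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "fastq_ftp",
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT}?{urlencode(params)}"

def parse_fastq_links(tsv_text):
    """Extract FASTQ links from a tab-separated report; empty list if none are listed."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    links = []
    for row in reader:
        field = row.get("fastq_ftp") or ""
        links.extend(part for part in field.split(";") if part)
    return links

def fetch_fastq_links(accession):
    """Query ENA over the network and return the FASTQ links for the accession."""
    with urlopen(filereport_url(accession)) as response:
        return parse_fastq_links(response.read().decode("utf-8"))
```

Calling `fetch_fastq_links("SRR768721")` needs network access. An empty list would suggest the FASTQ files are not on ENA yet, in which case sratoolkit remains the fallback.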

Reply • 6.9 years ago by GenoMax 141k