Question

fastq-dump: failure to get two fastq files

5.2 years ago

msrk04011 • 0

Hello,

I have been trying to get FASTQ files from the SRA data for SRR1030614. The run is registered as paired-end, so I tried the following:

$ fastq-dump --split-files SRR1030614

As a result, I got SRR1030614_2.fastq but not SRR1030614_1.fastq. In addition, fastq-dump printed the following message:

Rejected 27168787 READS because READLEN < 1
Read 27168787 spots for SRR1030614.sra
Written 27168787 spots for SRR1030614.sra

When I checked the SRR1030614 entry on NCBI SRA, the "Reads" tab showed read data such as:

Reads (separated)
>gnl|SRA|SRR1030614.1.1 1 (Technical)
Empty read
>gnl|SRA|SRR1030614.1.2 1 (Biological)
CTGATCCGAACATTGTGTACATGACCATTTCGATGATGTACAGTACAATCGTCACATAGA
AGATAACCCGCCACGCGCTAATTGTTTGGTTGCCGTGTGTG

So maybe I cannot get the SRR1030614_1.fastq file because the first read is empty? Also, if I cannot get two separate FASTQ files, is it OK to run the downstream analysis (e.g. Trinity) with the read file specified as single-end? Any comments would be much appreciated. Thank you.

RNA-Seq • 5.1k views

Thank you so much for your suggestion. As you pointed out, only one file is provided for this accession at ENA, and it gives the same result when run through fastq-dump. I'll check with the authors, but in the meantime I am going to try running Trinity with the reads specified as single-end.

5.2 years ago by msrk04011
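For the single-end Trinity run mentioned above, here is a rough sketch of how the command could be chosen from whichever mate files fastq-dump actually produced. `trinity_cmd` is a hypothetical helper, not part of Trinity, and the memory and CPU values are placeholders I made up. It only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Print (not run) a Trinity command based on which mate files exist
# and are non-empty. trinity_cmd is a hypothetical helper; the
# --max_memory and --CPU values are placeholders to adjust.
trinity_cmd() {
    local acc=$1
    local r1="${acc}_1.fastq" r2="${acc}_2.fastq"
    local opts="--seqType fq --max_memory 20G --CPU 4"
    if [ -s "$r1" ] && [ -s "$r2" ]; then
        # Both mates present: paired-end mode.
        echo "Trinity $opts --left $r1 --right $r2"
    elif [ -s "$r2" ]; then
        # Only the second file was written (the case in this thread).
        echo "Trinity $opts --single $r2"
    elif [ -s "$r1" ]; then
        echo "Trinity $opts --single $r1"
    else
        echo "no FASTQ files found for $acc" >&2
        return 1
    fi
}
```

Running `trinity_cmd SRR1030614` in the directory holding the fastq-dump output should print the `--single` form when only SRR1030614_2.fastq is present.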

score 1 · Answer 1 · 2019-02-23

On SRA, the first read is indeed listed with a read length of 0. My guess (and it is only a guess) is that this was a single-end run that was incorrectly labelled as paired when the data were uploaded. You will probably have to live with this. If you double-check at ENA, they provide only one file. You can still contact the authors and ask for clarification.
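One way to sanity-check the single file you did get is to count its reads and look at the read lengths. This is a minimal sketch, assuming the plain 4-line-per-record FASTQ layout; `fastq_summary` is just an illustrative name. If the read count equals the spot count reported by fastq-dump, that is at least consistent with one usable read per spot.

```shell
#!/usr/bin/env bash
# Summarise a FASTQ file: number of reads plus shortest and longest read.
# Assumes 4 lines per record, with the sequence on the 2nd line of each.
fastq_summary() {
    awk 'NR % 4 == 2 {
             n++
             len = length($0)
             if (n == 1 || len < min) min = len
             if (len > max) max = len
         }
         END { printf "reads=%d min_len=%d max_len=%d\n", n, min, max }' "$1"
}

# Example (file name taken from the question):
# fastq_summary SRR1030614_2.fastq
```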