Downloading PE reads from SRA but reads are not split
2.6 years ago
Michael 54k

I am trying to download some placozoan RNA-seq reads from SRA (SRR8193747, SRR8193748, SRR8193749) with fastq-dump so that I can assemble them:

 nohup fastq-dump --split-3 SRR8193748 SRR8193747 SRR8193749 &

According to SRA, the reads are paired. I therefore expected to get *_1.fastq and *_2.fastq files (and maybe some unpaired reads as well), as usual. In this case, however, I get only one file per run: SRR8193747.fastq, SRR8193748.fastq and SRR8193749.fastq.
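A quick way to confirm which layout was actually written is to look at the file names per accession. The sketch below assumes the usual `--split-3` naming (mates in `*_1.fastq` and `*_2.fastq`, everything else in `*.fastq`); `classify_split3_output` is a hypothetical helper name, not part of the SRA Toolkit.

```shell
# Hypothetical helper: report, per accession, which fastq-dump --split-3
# output files exist (and are non-empty) in a directory.
# Usage: classify_split3_output DIR ACCESSION [ACCESSION ...]
classify_split3_output() {
  dir=$1
  shift
  for acc in "$@"; do
    if [ -s "$dir/${acc}_1.fastq" ] && [ -s "$dir/${acc}_2.fastq" ]; then
      layout="paired"      # both mate files present
    elif [ -s "$dir/${acc}.fastq" ]; then
      layout="single"      # only the single/unpaired file present
    else
      layout="missing"     # nothing found for this accession
    fi
    printf '%s\t%s\n' "$acc" "$layout"
  done
}
```

Called as `classify_split3_output . SRR8193747 SRR8193748 SRR8193749`, this should report `single` for all three runs given the files listed above.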

The output from fastq-dump says:

Rejected 5037041 READS because of filtering out non-biological READS
Read 5037041 spots for SRR8193748
Written 5037041 spots for SRR8193748
Rejected 5157059 READS because of filtering out non-biological READS
Read 5157059 spots for SRR8193747
Written 5157059 spots for SRR8193747
Read 8889366 spots for SRR8193749
Written 8889366 spots for SRR8193749
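The numbers in this log can be compared mechanically. The following is a minimal sketch that assumes the log has the line shapes shown above, with any "Rejected" line preceding the "Read" line of the same run; `summarize_fastq_dump_log` is a hypothetical helper name, not an SRA Toolkit command.

```shell
# Hypothetical helper: for each run in a fastq-dump log, print
# run, spots read, reads rejected, and how the two counts relate.
# Reads the log from the given files, or from stdin if none are given.
summarize_fastq_dump_log() {
  awk '
    BEGIN { rejected = 0 }
    /^Rejected [0-9]+ READS/ { rejected = $2 }
    /^Read [0-9]+ spots for/ {
      spots = $2
      run = $5
      if (rejected == 0)           note = "no reads rejected"
      else if (rejected == spots)  note = "rejected equals spots"
      else                         note = "rejected differs from spots"
      printf "%s\t%d\t%d\t%s\n", run, spots, rejected, note
      rejected = 0   # reset for the next run
    }
  ' "$@"
}
```

For SRR8193748 and SRR8193747 the rejected-read count equals the spot count, which would be consistent with exactly one read per spot being filtered out, although the log alone does not prove that.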

Is there something wrong with the SRA files or with my setup?

sra fastq-dump rna-seq • 1.1k views
2.6 years ago

Weird indeed. Checking the metadata for one of the runs, it seems the length of read 2 is 0, which would explain the behavior you're seeing, but not why it is like that: https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR8193748

One explanation I can think of is that they actually had single-end data but submitted it as paired-end.


Yes, that may be it. I found the publication, and the methods say: "... for the haplotype H4 RNA libraries 32 – 37 million single 150 bp reads were obtained." I guess that means single-end.
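The "single 150 bp reads" statement can be checked against the downloaded files by tabulating read lengths. This is only a sketch: it assumes plain four-line FASTQ records (sequence on every second line of four), and `read_length_histogram` is a hypothetical helper name.

```shell
# Hypothetical helper: print "length<TAB>count" for the reads in a FASTQ file,
# sorted by length. Assumes four lines per record; reads stdin if no file is given.
read_length_histogram() {
  awk '
    NR % 4 == 2 { count[length($0)]++ }   # line 2 of each record is the sequence
    END { for (len in count) printf "%d\t%d\n", len, count[len] }
  ' "$@" | sort -n
}
```

For example, `read_length_histogram SRR8193748.fastq` would show whether the reads top out at a single read length rather than at two mates concatenated.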


It does in my book ;)
