I have converted a paired-end 454 SRA file (SRR1171018.sra, Argopecten irradians) to FASTQ using fastq-dump.2.3.2:
</path/to/fastq-dump/> -F --split-files </path/to/SRR1171018.sra>
While the _2 file is absolutely normal, the entirety of the _1 file looks like this:
@IE4R6ZA01CKY6V
TCAG
+IE4R6ZA01CKY6V
IIII
@IE4R6ZA01EDSKW
TCAG
+IE4R6ZA01EDSKW
IIII
@IE4R6ZA01DTY42
TCAG
I initially thought that this might be single-end data incorrectly labelled as paired-end at NCBI, but converting to a single FASTQ file resulted in every read beginning with TCAG.
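For anyone who wants to reproduce the check, here is a minimal Python sketch of how the read lengths and the leading TCAG could be tallied for a FASTQ file. It assumes plain four-line records (no wrapped sequence lines); the function names `iter_fastq` and `summarize` and the output file name in the usage comment are my own placeholders, not part of any toolkit.

```python
from collections import Counter
from itertools import islice


def iter_fastq(lines):
    """Yield (name, sequence, quality) from an iterable of FASTQ lines.

    Assumes strict 4-line records; a trailing incomplete record is ignored.
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return
        header, seq, _plus, qual = (line.rstrip("\n") for line in record)
        yield header[1:], seq, qual


def summarize(lines, prefix="TCAG"):
    """Count reads, reads starting with `prefix`, and reads per length."""
    lengths = Counter()
    total = 0
    with_prefix = 0
    for _name, seq, _qual in iter_fastq(lines):
        total += 1
        lengths[len(seq)] += 1
        if seq.startswith(prefix):
            with_prefix += 1
    return {"reads": total, "with_prefix": with_prefix, "lengths": dict(lengths)}


# Usage (file name is an assumption about what --split-files wrote):
# with open("SRR1171018_1.fastq") as handle:
#     print(summarize(handle))
```

If every read in _1 has length 4 and starts with TCAG, then that file contains nothing but the same four bases repeated; running the same summary on _2 shows whether its reads also begin with TCAG.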
I have converted at least 100 SRA files this way in the last 2 months and have never seen this before.
- Is this just bad data?
- Could I assemble _2 as if it were single-end to avoid losing the data?