I have to analyze paired-end RNA-seq reads that are in an unusual format: both reads of each pair are joined in one FASTQ file. I need to split this file into two separate paired-end FASTQ files.
There is a Galaxy tool named FASTQ splitter that can do this:
What it does
Splits a single fastq dataset representing paired-end run into two datasets (one for each end). This tool works only for datasets where both ends have the same length.
Sequence identifiers will have /1 or /2 appended for the split left-hand and right-hand reads, respectively.
A multiple-fastq file, for example:
@HWI-EAS91_1_30788AAXX:7:21:1542:1758
GTCAATTGTACTGGTCAATACTAAAAGAATAGGATCGCTCCTAGCATCTGGAGTCTCTATCACCTGAGCCCA
+HWI-EAS91_1_30788AAXX:7:21:1542:1758
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh`hfhhVZSWehR

@HWI-EAS91_1_30788AAXX:7:21:1542:1758/1
GTCAATTGTACTGGTCAATACTAAAAGAATAGGATC
+HWI-EAS91_1_30788AAXX:7:21:1542:1758/1
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh

@HWI-EAS91_1_30788AAXX:7:21:1542:1758/2
GCTCCTAGCATCTGGAGTCTCTATCACCTGAGCCCA
+HWI-EAS91_1_30788AAXX:7:21:1542:1758/2
hhhhhhhhhhhhhhhhhhhhhhhh`hfhhVZSWehR
Do you know of any other standalone script that can do this job?
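For reference, the behaviour described in the tool help can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the Galaxy tool's actual code: it assumes each record occupies exactly four lines (no wrapped sequences), that both ends have the same length so each read can be cut in the middle, and that /1 and /2 should be appended to the identifier. The function names (split_record, split_joined_fastq, split_file) are made up for illustration.

```python
import sys


def _tag(line, suffix):
    """Append suffix to the identifier, keeping any description after a space."""
    ident, sep, rest = line.partition(" ")
    return ident + suffix + sep + rest


def split_record(header, seq, plus, qual):
    """Cut one joined record in the middle; return (left, right) 4-line records."""
    if len(seq) != len(qual):
        raise ValueError("sequence/quality length mismatch: " + header)
    if len(seq) % 2 != 0:
        raise ValueError("odd read length, cannot split evenly: " + header)
    half = len(seq) // 2
    ends = []
    for suffix, part in (("/1", slice(0, half)), ("/2", slice(half, None))):
        # A bare "+" separator is kept as-is; a repeated name gets the suffix too.
        plus_line = plus if plus == "+" else _tag(plus, suffix)
        ends.append((_tag(header, suffix), seq[part], plus_line, qual[part]))
    return ends[0], ends[1]


def split_joined_fastq(lines):
    """Yield (left, right) record pairs from an iterable of FASTQ lines."""
    stripped = (line.rstrip("\r\n") for line in lines)
    for header in stripped:
        if not header:
            continue  # tolerate blank lines between records
        try:
            seq, plus, qual = next(stripped), next(stripped), next(stripped)
        except StopIteration:
            raise ValueError("truncated record: " + header) from None
        yield split_record(header, seq, plus, qual)


def split_file(joined_path, left_path, right_path):
    """Write the left-hand and right-hand reads to two separate FASTQ files."""
    with open(joined_path) as src, \
            open(left_path, "w") as left_out, \
            open(right_path, "w") as right_out:
        for left, right in split_joined_fastq(src):
            left_out.write("\n".join(left) + "\n")
            right_out.write("\n".join(right) + "\n")


if __name__ == "__main__":
    if len(sys.argv) == 4:
        split_file(*sys.argv[1:])
    else:
        print("usage: split_fastq.py joined.fastq left.fastq right.fastq",
              file=sys.stderr)
```

Saved under a name such as split_fastq.py (hypothetical), it would be run as `python split_fastq.py joined.fastq left.fastq right.fastq`. Note that the second read is only cut out, not reverse-complemented, so its orientation stays whatever it was in the joined file.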