I am designing some bioinformatics software, but have little working experience with FASTQ data.
The data I wish to compute over is paired-end data. As I understand it, this consists of "mate sequences", namely left and right mates, which correspond to sequencing the same region of the genome in the forward and reverse orientations.
My question is about how this data is returned to the user after sequencing. Is the researcher given separate files, each containing only forward or only reverse orientation sequences? Or is the data mixed together in a single file?
This basically comes down to how I process the data in the software. If each orientation is delivered in its own file, then I can allow the user to specify the orientation at the command line; otherwise, I will have to read every sequence ID to determine the orientation.
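To make the second option concrete, here is a minimal sketch of what I imagine "reading every sequence ID" would look like. It assumes the mate number appears either as a trailing `/1` or `/2` on the read ID (older Illumina style) or as the first field after a space (Casava 1.8+ style); I have not confirmed which convention, if any, my input files follow. The names `mate_number` and `iter_fastq` are my own placeholders.

```python
import re

# Assumed header conventions (not confirmed for any particular dataset):
#   older Illumina style:  @READ_ID/1  or  @READ_ID/2
#   Casava 1.8+ style:     @READ_ID 1:N:0:ACGT  (mate number follows the space)
_OLD_STYLE = re.compile(r"/([12])$")
_NEW_STYLE = re.compile(r"^([12]):[YN]:\d+")


def mate_number(header):
    """Return 1 or 2 for a FASTQ header line, or None if it cannot be determined."""
    fields = header.rstrip("\n").split(None, 1)
    if not fields:
        return None
    match = _OLD_STYLE.search(fields[0])
    if match:
        return int(match.group(1))
    if len(fields) == 2:
        match = _NEW_STYLE.match(fields[1])
        if match:
            return int(match.group(1))
    return None


def iter_fastq(lines):
    """Yield (header, sequence, quality) tuples, assuming four lines per record."""
    it = iter(lines)
    # zip over the same iterator consumes four lines at a time;
    # an incomplete trailing record is silently dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")
```

With something like this, a mixed file could be split by calling `mate_number(header)` on each record from `iter_fastq(open(path))`, whereas separate files would only need a command-line flag.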
Kind regards, Izaak