Question: Is there a paired-end joiner that also writes out the reads as they were before merging?
Hi, I have a slightly odd question. I'm looking for software that not only merges the paired reads but also writes the original reads of each merged pair to new files.

When a tool merges a pair of reads, it writes the merged read to the output FASTQ (or FASTA), but there is no option to see which reads were paired. I've checked several tools, and most of them only create a file with the unpaired reads, that is, the discarded ones. I've been trying to find one that would give output like this:


The first two files would contain the reads that are going to be merged, that is, the reads whose mate has already been found. Is there software with an option like that? I know it sounds odd, but it would help me filter a pool containing a lot of unwanted sequences from different sources.

Thanks for your time.

Am I right that you would like one file with the merged reads, plus separate files for the forward and reverse reads of the pairs that could be merged?

If so, I would first do the merge, extract the read names from the merged output, and then use these read names to extract the corresponding forward and reverse reads from the original FASTQ files.

$ bbmerge.sh in1=<read1> in2=<read2> out=<merged reads> outu1=<unmerged1> outu2=<unmerged2>
$ seqkit seq -n -i <merged reads> > id_merged.txt
$ seqkit grep -f id_merged.txt <read1> | bgzip -c  > SAMPLE_MATCHED_FORWARD.fastq.gz
$ seqkit grep -f id_merged.txt <read2> | bgzip -c  > SAMPLE_MATCHED_REVERSE.fastq.gz
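
If seqkit is not at hand, the same ID-based extraction can be sketched with plain awk. This is only an illustration, not a replacement for the commands above: it assumes uncompressed FASTQ with strict four-line records, and that the read ID is the header text up to the first whitespace (so forward and reverse reads share the same ID). The function name `extract_by_id` is made up for this example.

```shell
# extract_by_id IDS FASTQ
# Hypothetical helper: print the FASTQ records whose ID is listed in IDS.
# Assumes 4-line records (no wrapped sequence lines) and one ID per line in IDS.
extract_by_id() {
    awk '
        # While reading the ID list, remember every ID and skip to the next line.
        FILENAME == ARGV[1] { keep[$1] = 1; next }
        # Header line of each record: drop the leading "@" and keep the text
        # before the first whitespace, then decide whether the record is wanted.
        FNR % 4 == 1 { id = substr($1, 2); wanted = (id in keep) }
        # Print all four lines of a wanted record.
        wanted
    ' "$1" "$2"
}
```

It would be called once per original file, for example `extract_by_id id_merged.txt reads_1.fastq > SAMPLE_MATCHED_FORWARD.fastq` (file names are placeholders). Headers are located by line position rather than by a leading `@`, because quality lines may also start with `@`.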

fin swimmer

