Hi, I have two FASTQ files for each of my samples (one for forward and one for reverse reads), each containing about 1M reads. Demultiplexing and adapter trimming are already done, so the reads contain only genomic sequence. The reads are quite short (~100bp) and all cover the same genomic region (ultra-deep sequencing). Forward and reverse reads overlap over the full 100bp. I would like to merge each forward/reverse pair into one consensus sequence, so that I am left with a single FASTQ file. The merged reads should have Ns at positions where the forward and reverse read do not match.
I know there are a lot of programs like bbmerge.sh, pear etc., but none of them seems to do exactly what I want, which is to keep only the matching bases and replace everything else with Ns.
Maybe someone knows a tool that does this. If not, I'll probably align both files separately with bwa and then go through the reads with a Python script and do the matching myself.
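To make the goal concrete, here is a rough sketch of the matching step I have in mind. It skips the alignment entirely and rests on several assumptions: both files list the pairs in the same order, each pair has the same length and overlaps end to end with no offset, the files are uncompressed 4-line FASTQ, and bases are uppercase. The function names (`merge_pair`, `merge_files`) and the quality handling (keep the higher of the two scores on a match, `!` on a mismatch) are just my own choices for illustration.

```python
# Sketch only: assumes full-length, offset-free overlap and identical read order.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(fwd_seq, fwd_qual, rev_seq, rev_qual):
    """Return (consensus_seq, consensus_qual) for one fully overlapping pair."""
    # Bring the reverse read into the orientation of the forward read.
    rev_seq = reverse_complement(rev_seq)
    rev_qual = rev_qual[::-1]
    if len(fwd_seq) != len(rev_seq):
        raise ValueError("reads differ in length; full overlap is assumed")
    seq, qual = [], []
    for fb, fq, rb, rq in zip(fwd_seq, fwd_qual, rev_seq, rev_qual):
        if fb == rb and fb != "N":
            seq.append(fb)
            qual.append(max(fq, rq))  # assumption: keep the higher score
        else:
            seq.append("N")
            qual.append("!")          # assumption: lowest Phred+33 score
    return "".join(seq), "".join(qual)


def read_fastq(handle):
    """Yield (header, sequence, quality) from a plain 4-line FASTQ file."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header, seq, qual


def merge_files(fwd_path, rev_path, out_path):
    """Write one consensus record per read pair to out_path."""
    with open(fwd_path) as fwd, open(rev_path) as rev, open(out_path, "w") as out:
        for (header, fs, fq), (_, rs, rq) in zip(read_fastq(fwd), read_fastq(rev)):
            seq, qual = merge_pair(fs, fq, rs, rq)
            out.write(f"{header}\n{seq}\n+\n{qual}\n")
```

If the pairs can be shifted against each other, this position-by-position comparison would no longer be valid and some alignment step would be needed first.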
I hope my explanation was clear enough. Thank you in advance.