Hello, I used PEAR to merge my Illumina (MiSeq) paired-end reads. I'd like to know the overlap length for each merged read so that I can calculate the minimum, maximum, mean, mode, median, etc. Can you suggest how to do this? I tried a Perl script (length R1 + length R2 - length merged R1R2), but I am not very good at programming. Can anybody help me and tell me how to do this? Thanks!
Can you post an example of the input files, please?
I have three files in FASTQ format: R1.fastq, R2.fastq and pear_R1R2.assembled.fastq.
This is an example of what I am talking about. I show only the sequence (the 2nd line of each record in the FASTQ file):

R1          A G T G C A C T A G T G G T
R2                  C A C T A G T G G T A T A A
PEAR R1/R2  A G T G C A C T A G T G G T A T A A

I want to extract the length of the overlap region C A C T A G T G G T (10 in this case) for every read contained in the pear_R1R2.assembled.fastq file.
Thank you very much!
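The formula you tried (length R1 + length R2 - length merged) can be sketched in Python roughly as below. This is only a sketch under some assumptions that are not stated in the thread: the three files are uncompressed FASTQ with exactly 4 lines per record, and the first word of each header line (ignoring a trailing /1 or /2) is the same for a read in all three files. The function names `read_lengths`, `overlap_lengths`, `summarise` and `overlap_stats_from_files` are made up for this example. Reads are matched by ID rather than by position, because the assembled file only contains the pairs that were merged.

```python
import statistics


def read_lengths(lines):
    """Map read ID -> sequence length, assuming 4-line FASTQ records."""
    lengths = {}
    lines = iter(lines)
    for header in lines:
        if not header.strip():
            continue  # skip blank lines, e.g. at the end of the file
        seq = next(lines).strip()
        next(lines)  # '+' separator line
        next(lines)  # quality line
        # Assumed ID: first word of the header, without '@' and without /1 or /2
        read_id = header.strip()[1:].split()[0]
        if read_id.endswith(("/1", "/2")):
            read_id = read_id[:-2]
        lengths[read_id] = len(seq)
    return lengths


def overlap_lengths(r1, r2, merged):
    """Overlap = len(R1) + len(R2) - len(merged), for each merged read."""
    overlaps = []
    for read_id, merged_len in merged.items():
        if read_id in r1 and read_id in r2:
            overlaps.append(r1[read_id] + r2[read_id] - merged_len)
    return overlaps


def summarise(overlaps):
    """Basic descriptive statistics of a non-empty list of overlap lengths."""
    return {
        "min": min(overlaps),
        "max": max(overlaps),
        "mean": statistics.mean(overlaps),
        "median": statistics.median(overlaps),
        "mode": statistics.mode(overlaps),
    }


def overlap_stats_from_files(r1_path, r2_path, merged_path):
    """Convenience wrapper: read the three FASTQ files and summarise."""
    with open(r1_path) as f1, open(r2_path) as f2, open(merged_path) as fm:
        r1, r2, merged = read_lengths(f1), read_lengths(f2), read_lengths(fm)
    return summarise(overlap_lengths(r1, r2, merged))
```

With the file names from the thread, you would call `overlap_stats_from_files("R1.fastq", "R2.fastq", "pear_R1R2.assembled.fastq")` and print the result. For the example above this gives 14 + 14 - 18 = 10. One caveat: if a merged read is shorter than one of its input reads (a fragment shorter than the read length), this number is no longer simply the overlap, so it may be worth checking those cases separately.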