I would like to split a fastq file into two separate fastq files at a specific position within each read. For example, I have these reads:
@HWI-ST225:523:D1AY5ACXX:8:1101:1566:2149 1:N:0:GTGGCC
GCATACCCTCCCTGTCTCAGTTGCTGTTGAAAGAAGAAATCCGCGATATCTTATCCAACCCGCGATATCTTATCCAACGAAGCCAAAACCCTCGCAGTCTG
+
??@DDDDDHHHHHIGIHGHICHEGAHC<HHII9?FB3?F4C<FDD>0?9D*99*??D@;AA9=?'''3@@>@A@>::?B?2?-?CC3<A??8?B@B#####
@HWI-ST225:523:D1AY5ACXX:8:1101:2000:2191 1:N:0:GTGGCC
TTGCTTGTTGCGTGTCTCAGGCGGAAAAACGCCAAAATCAACCGCGATATACATTCCAACCCGCGATATACATTCCAACGTCAGCTCTGAGCTGCTGATCT
+
@@CFFFFFHHHHFGFIIJJJIIIHHIJJIIGGGGIEHICHAEDEHHHHADDB9?BEDDDBBDDDDBBB>DDC@@C@CCDBBD05>CCDACDDDDDDC:3>>
If you look a bit closer, you can see that there is a pair of linkers in each of the reads. The linkers are:
A - CCGCGATAT CTTA TCCAAC
B - CCGCGATAT ACAT TCCAAC
The linkers differ in only four bases in the middle. I would like to cut each read exactly at the junction between its two linkers and write the two halves to two separate fastq files (or, alternatively, trim away one side at a time), while keeping the quality values that belong to the retained bases.
Does anyone know a way to do so?
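To make the goal concrete, here is a minimal Python sketch of the operation I have in mind. It is not a tested pipeline. It assumes plain four-line FASTQ records, that the two linkers sit directly next to each other with no mismatches, and that any A/B combination may occur. The function names (`find_junction`, `read_fastq`, `split_fastq`) and the file names in the usage comment are made up for illustration.

```python
# Linker sequences from the post, written without the spaces.
LINKER_A = "CCGCGATATCTTATCCAAC"
LINKER_B = "CCGCGATATACATTCCAAC"
LINKERS = (LINKER_A, LINKER_B)


def find_junction(seq):
    """Return the index where the second linker starts, or None.

    Assumption: the two linkers are adjacent and match exactly.
    The first linker pair that is found decides the cut position.
    """
    for first in LINKERS:
        for second in LINKERS:
            pos = seq.find(first + second)
            if pos != -1:
                return pos + len(first)
    return None


def read_fastq(handle):
    """Yield (header, seq, qual) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            return  # end of file (or a blank line, which this sketch treats as the end)
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\r\n")
        yield header, seq, qual


def split_fastq(in_handle, left_handle, right_handle):
    """Split every read at the linker junction.

    The part up to and including the first linker goes to left_handle,
    the part starting with the second linker goes to right_handle.
    Sequence and quality strings are cut at the same index, so the
    quality values stay aligned with their bases.
    Reads without a linker pair are skipped. Returns the number of reads split.
    """
    n_split = 0
    for header, seq, qual in read_fastq(in_handle):
        cut = find_junction(seq)
        if cut is None:
            continue
        left_handle.write(f"{header}\n{seq[:cut]}\n+\n{qual[:cut]}\n")
        right_handle.write(f"{header}\n{seq[cut:]}\n+\n{qual[cut:]}\n")
        n_split += 1
    return n_split


# Hypothetical usage (file names are placeholders):
# with open("reads.fastq") as src, \
#      open("left.fastq", "w") as left, \
#      open("right.fastq", "w") as right:
#     split_fastq(src, left, right)
```

Exact string matching would miss reads with a sequencing error inside a linker, so something that tolerates mismatches would probably be needed for real data.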