I have FASTQ files for some miRNA libraries prepared with the QIAseq miRNA Library Kit. I need to extract the UMIs, but the UMI comes after a sequence that is common to all the reads, like this:
Here the N are the miRNA sequence, the bold part is the sequence common to all the reads, and the X are the UMI sequence.
How can I remove the bold part and append the UMI to the header of each read in the FASTQ file? The difficulty is that around 3-5% of the reads don't contain the common sequence exactly. I suppose sequencing errors have changed part of it in those reads, but I don't know how to allow a one-letter change (a single mismatch) in the common part when searching for it.
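To make the question concrete, here is a minimal sketch in Python of the behaviour I am after: find the common sequence while tolerating one substitution, take the bases right after it as the UMI, and trim the read to the miRNA part. The function names `find_adapter` and `extract_umi`, the `adapter` and `umi_len` parameters, and the `_UMI` suffix on the read ID are assumptions made for illustration, not something taken from the kit documentation. The sketch only handles substitutions, not insertions or deletions.

```python
def find_adapter(seq, adapter, max_mismatches=1):
    """Return the start of the best adapter match in seq, or -1.

    A window counts as a match if its Hamming distance to the adapter
    is <= max_mismatches. The window with the fewest mismatches wins,
    and the leftmost one is kept on ties. Only substitutions are
    handled (no indels).
    """
    best_pos, best_mm = -1, max_mismatches + 1
    for i in range(len(seq) - len(adapter) + 1):
        window = seq[i:i + len(adapter)]
        mm = sum(a != b for a, b in zip(window, adapter))
        if mm < best_mm:
            best_pos, best_mm = i, mm
            if mm == 0:  # cannot do better than an exact match
                break
    return best_pos


def extract_umi(header, seq, qual, adapter, umi_len, max_mismatches=1):
    """Trim one FASTQ record and move its UMI into the header.

    Returns (new_header, trimmed_seq, trimmed_qual), or None when the
    adapter is not found or the read ends before a full UMI.
    adapter and umi_len must be supplied by the caller; the values
    depend on the library kit and are not hard-coded here.
    """
    pos = find_adapter(seq, adapter, max_mismatches)
    if pos < 0:
        return None
    umi_start = pos + len(adapter)
    umi = seq[umi_start:umi_start + umi_len]
    if len(umi) < umi_len:
        return None
    # Append the UMI to the read ID (the first whitespace-delimited
    # token), keeping any trailing description unchanged.
    read_id, _, rest = header.partition(" ")
    new_header = f"{read_id}_{umi}" + (f" {rest}" if rest else "")
    return new_header, seq[:pos], qual[:pos]
```

Each four-line FASTQ record would be passed through `extract_umi` with the real common sequence and UMI length for the kit; records that return `None` could be written to a separate file for inspection. Raising `max_mismatches` rescues more reads but also increases the chance of a spurious match inside the miRNA sequence itself.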
Thank you very much!