I am trying to clean up reads in the FASTQ files we received from a miRNA-Seq run. The read structure is shown in the figure below. I can use Cutadapt to remove the adapter (we have the adapter sequence) and keep only reads of 15 to 55 nt with the -m and -M options. Before that step, I want to remove the common sequence (which we know) and the UMI. I tried SeqKit's grep command: seqkit grep -rvip ATCTGTAGGCAGGATCAAT s1.fq.gz -o s1.clean.fq.gz, but the cleaned output FASTQ file looks almost identical to the input file, so I seem to be missing something.
Are there any tools that I can use to remove the common sequence and the UMI before I proceed to trim reads with Cutadapt?
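To make clear what I mean by "remove", here is a minimal Python sketch of the per-read operation I am after. As far as I understand, a grep-style filter keeps or drops whole reads, whereas I want each read trimmed. The sketch assumes the part to keep comes first, followed by the common sequence and then the UMI. That layout, the 12 nt UMI length, the function name trim_common_and_umi and the example read are all assumptions for illustration, not taken from our data.

```python
def trim_common_and_umi(seq, qual, common, umi_len=12):
    """Cut a read at the first occurrence of `common`.

    Assumed layout (hypothetical): <part to keep><common><UMI>...
    Returns (kept_seq, kept_qual, umi). If `common` is not found,
    the read is returned unchanged with an empty UMI.
    """
    pos = seq.upper().find(common.upper())  # case-insensitive exact match
    if pos == -1:
        return seq, qual, ""
    umi_start = pos + len(common)
    umi = seq[umi_start:umi_start + umi_len]
    # Trim sequence and quality string at the same position
    return seq[:pos], qual[:pos], umi


# Made-up read: 10 nt insert + common sequence + 12 nt UMI (assumed length)
common = "ATCTGTAGGCAGGATCAAT"
read = "TGAGGTAGTA" + common + "ACGTACGTACGT"
qual = "I" * len(read)

kept_seq, kept_qual, umi = trim_common_and_umi(read, qual, common)
print(kept_seq, umi)
```

This only handles exact matches of the common sequence. A real tool would also need to tolerate sequencing errors and partial matches at the read end.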