I'm having trouble processing some sequencing data from an unusual use case. My reads contain a variable region (20bp) followed by a constant region (20bp), and I would like to filter reads using a different maximum edit distance for each region, because the sequencing quality drops off a bit once the read reaches the constant region (the run was loaded with too little PhiX). I've been using bowtie2 and samtools to try to stitch this together in a two-step process, but it hasn't really been working out.
My thought was that I could first soft-clip the variable region and filter out any reads that don't map to the constant region within the required edit distance, then (after writing the surviving reads back out to fastq) trim the constant region, align the variable region, and filter again. Unfortunately, bowtie2's --local mode isn't soft-clipping well enough for the reads to align: when I hard trim down to the constant region I get >98% alignment, but with --local no reads align at all. And if I hard trim, I can't re-align to the variable region afterwards because it has been trimmed off every read.
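To make the goal concrete, here is a minimal sketch in plain Python of the per-region filter described above. It is only an illustration under stated assumptions, not what bowtie2 or samtools do: it assumes the variable region is always exactly the first 20 bases of the read (so an indel that shifts the boundary is not handled), that the constant sequence and the set of candidate variable sequences are known up front, and that comparing every read against every candidate is affordable. The names `edit_distance`, `passes_filter`, `max_var_edits`, and `max_const_edits` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def passes_filter(read, constant_ref, variable_refs,
                  max_var_edits, max_const_edits, var_len=20):
    """Keep a read only if each region is within its own edit-distance limit.

    Assumption: the read is split at a fixed offset (var_len), so the first
    var_len bases are the variable region and the rest is the constant region.
    """
    var_part, const_part = read[:var_len], read[var_len:]
    # Step 1: the constant region gets the looser (or tighter) limit.
    if edit_distance(const_part, constant_ref) > max_const_edits:
        return False
    # Step 2: the variable region must be close to at least one candidate.
    return any(edit_distance(var_part, ref) <= max_var_edits
               for ref in variable_refs)
```

With a made-up 20bp candidate and 20bp constant sequence, `passes_filter(read, constant, [candidate], max_var_edits=0, max_const_edits=2)` would keep a read with a perfect variable region and up to two edits in the constant region. This brute-force comparison is exactly the kind of inefficient approach an aligner should be able to replace.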
Any suggestions? I'm happy to use any aligner or other software to pull this off. I've been working on it for 4 days, and I'm about to do something dumb and inefficient if I can't figure out a smart way to do this. I've tried a couple of other aligners (bwa aln, STAR) but have not had success with them either.