I'm looking for a fast tool to scan high-throughput sequencing reads for patterns, so that the final, thorough analysis can be performed on only the few reads that belong to the cluster.
Practically, I manually generated a set of oligos (~20-mers) that uniquely belong to a cluster of genes, and now I want to find all the reads that match them. The next step is finding neighboring reads, but even the first step, which I implemented with a Python regex string search, takes weeks on whole-genome sequencing data.
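
To make the first step concrete, here is a minimal sketch of the matching. It is only an illustration under assumptions: all oligos have the same length, only exact matches are wanted, and both strands count. Instead of one regex search per oligo, it looks up every window of a read in a set. The function names are hypothetical.

```python
# Hypothetical sketch: function names and the fixed oligo length are assumptions.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_oligo_set(oligos):
    """Collect each oligo and its reverse complement, upper-cased, in a set."""
    wanted = set()
    for oligo in oligos:
        oligo = oligo.upper()
        wanted.add(oligo)
        wanted.add(reverse_complement(oligo))
    return wanted


def read_matches(read, wanted, k):
    """Return True if any length-k window of the read is in the oligo set."""
    read = read.upper()
    return any(read[i:i + k] in wanted for i in range(len(read) - k + 1))


def filter_reads(reads, oligos):
    """Yield the reads that contain at least one oligo on either strand."""
    lengths = {len(o) for o in oligos}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes all oligos have the same length")
    k = lengths.pop()
    wanted = build_oligo_set(oligos)
    for read in reads:
        if read_matches(read, wanted, k):
            yield read
```

With this layout the work per read depends on the read length rather than on the number of oligos, since each window costs one set lookup. Mismatches or oligos of mixed lengths would need a different approach.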