I want to analyse data from a CRISPR/Cas9 screen (control vs. treatment) and I'm using MAGeCK (https://sourceforge.net/projects/mageck/). MAGeCK determines the trim length of the reads automatically, but when the trim length is variable it recommends using cutadapt instead.
I used cutadapt to remove the 5' sequence in front of the 20-nt gRNA of interest (there are 12 different guides), but some reads that contain the complete gRNA are not trimmed because they contain large deletions. I tried increasing cutadapt's maximum error rate, but the MAGeCK gRNA counts obtained after trimming are still slightly smaller than the counts I get by searching for each gRNA with grep.
Is it possible to improve the trimming so that only the gRNA sequences remain (the 3' end does not need to be removed)?
This issue is similar to my problem: https://github.com/marcelm/cutadapt/issues/261
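
To make the comparison concrete, here is a minimal Python sketch of the grep-style search described above: each read is scanned for an exact match to a known guide and, if one is found, everything upstream of the guide is dropped while the 3' end is kept. The guide names, guide sequences, flanking sequences and function names are all hypothetical placeholders, not my real library and not part of MAGeCK or cutadapt, and the sketch assumes exact matches on the forward strand only.

```python
from collections import Counter

def trim_to_guide(read, guides):
    """Return (guide_name, read trimmed to start at the guide), or None.

    Only exact forward-strand matches are considered (an assumption of
    this sketch); the 3' sequence after the guide is left in place.
    """
    for name, seq in guides.items():
        pos = read.find(seq)
        if pos != -1:
            return name, read[pos:]
    return None

def count_guides(reads, guides):
    """Count reads per guide, like running grep once per guide sequence."""
    counts = Counter({name: 0 for name in guides})
    for read in reads:
        hit = trim_to_guide(read, guides)
        if hit is not None:
            counts[hit[0]] += 1
    return dict(counts)

# Hypothetical 20-nt guides and reads, for illustration only.
guides = {
    "g1": "ACGTACGTACGTACGTACGT",
    "g2": "TTGCAATTGCAATTGCAATT",
}
reads = [
    "CACCG" + guides["g1"] + "GTTTTAGA",  # intact 5' sequence
    "G" + guides["g1"] + "GTTTTAGA",      # most of the 5' sequence deleted
    "CACCG" + guides["g2"] + "GTTT",      # intact 5' sequence, second guide
    "CACCGAAAAAAAAAAAAAAAAAAAAGTTT",      # no known guide
]

print(count_guides(reads, guides))
```

Because the search is anchored on the guide itself rather than on the 5' sequence, a read with a large upstream deletion is still counted, which would explain why the grep counts come out higher than the counts after adapter-based trimming.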