I want to find the coordinates of all occurrences of the sequence recognized by a restriction enzyme. I know EMBOSS can do this, but the task seems a perfect fit for short-read sequence alignment software. However, I could not find any reference to this approach.
I used bwa for the task and quickly obtained some results. However, to be on the safe side, I would like to ask whether someone has done something similar or has some advice; perhaps I am stretching the use of bwa.
I tried the following:
echo -e ">DpnII\nGATC" > DpnII.fa
bwa aln -N -n 0 -o 0 -e 0 -l 4 -k 0 dm3 DpnII.fa > DpnII.sai
bwa samse -n 100000000 dm3 DpnII.sai DpnII.fa > DpnII.sam
The results seem to match the right sites, and the number of sites I obtain (489570 for Drosophila) is close to what I expect.
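One way to cross-check the bwa output would be a direct exact-match scan of the genome. The following is only a minimal sketch in Python, assuming the reference is available as a plain FASTA file; the function names (`read_fasta`, `find_sites`, `sites_by_sequence`) and the file path are hypothetical and not part of bwa or EMBOSS. Since GATC is palindromic, scanning the forward strand alone should be enough; matching is case-insensitive on the assumption that the reference is soft-masked (lowercase repeats).

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from the lines of a FASTA file."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first token of the header as the sequence name.
            name = (line[1:].split() or [""])[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_sites(seq: str, motif: str = "GATC") -> List[int]:
    """Return 1-based start positions of every occurrence of motif in seq.

    A zero-width lookahead is used so that overlapping occurrences are
    reported too (irrelevant for GATC, but harmless for other motifs).
    """
    pattern = re.compile("(?=%s)" % re.escape(motif), re.IGNORECASE)
    return [m.start() + 1 for m in pattern.finditer(seq)]


def sites_by_sequence(fasta_path: str, motif: str = "GATC") -> Dict[str, List[int]]:
    """Map each sequence name in the FASTA file to its list of site positions."""
    with open(fasta_path) as handle:
        return {name: find_sites(seq, motif) for name, seq in read_fasta(handle)}
```

For example, `sum(len(v) for v in sites_by_sequence("dm3.fa").values())` (with `dm3.fa` standing in for wherever the reference lives) would give a total to compare against the number of hits in DpnII.sam. This loads each chromosome into memory in turn, which should be fine for a genome the size of Drosophila.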