I have a protein X with a known amino-acid sequence; both its cDNA and its genomic sequence (including introns and UTRs) are on GenBank.
I have used random mutagenesis to create random mutations in the genomic sequence of protein X, and I wish to use CRISPR-Cas9 to target these mutations. However, Cas9 can only target codons that lie up to 21 nt 5' of a GG (PAM) motif or up to 21 nt 3' of a CC motif in the genome. If I want to maximise specificity, the range is only 11 nt 5' or 3', respectively. Hence I am aware that some residues' codons may be too far from a PAM site to mutate using Cas9.
The problem is that I don't know how to determine which residues have codons that are too far (more than 21 or 11 nt) from a PAM motif! If the DNA sequence were only cDNA, I would have some idea, but to complicate things the genomic sequence contains UTRs and introns, and these may contain PAM sites as well!
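To make the constraint concrete, here is a rough Python sketch of the kind of check I have in mind. It is only an illustration under my own assumptions, not an established tool. The function name `unreachable_codons` and its parameters are hypothetical. I assume `cds_exons` lists the coding parts of each exon as 0-based, half-open `(start, end)` intervals on the strand given in `genomic` (GenBank `CDS` `join(...)` coordinates are 1-based and inclusive, so they would need converting first). I also assume a codon counts as reachable only if all three of its bases fall within the window of one and the same GG or CC motif, with distances measured from the first base of a GG and from the last base of a CC.

```python
import re
from bisect import bisect_left


def motif_starts(seq, motif):
    """Start indices of all (possibly overlapping) occurrences of motif."""
    return [m.start() for m in re.finditer(f"(?={motif})", seq)]


def any_in_range(sorted_positions, lo, hi):
    """True if some value v in sorted_positions satisfies lo <= v <= hi."""
    i = bisect_left(sorted_positions, lo)
    return i < len(sorted_positions) and sorted_positions[i] <= hi


def unreachable_codons(genomic, cds_exons, window=21):
    """Return 1-based residue numbers whose codons are out of range.

    Assumption: introns and UTRs are searched for GG/CC motifs like any
    other part of the genomic sequence, since Cas9 acts on genomic DNA.
    """
    genomic = genomic.upper()
    gg = motif_starts(genomic, "GG")
    cc = motif_starts(genomic, "CC")

    # Genomic index of every coding base in order; introns/UTRs are skipped,
    # so a codon split by an intron keeps its true genomic positions.
    coding = [p for start, end in cds_exons for p in range(start, end)]

    out = []
    for k in range(0, len(coding) - 2, 3):
        first, last = coding[k], coding[k + 2]
        # GG must start 1..window nt downstream of every base of the codon.
        by_gg = any_in_range(gg, last + 1, first + window)
        # CC must end 1..window nt upstream of every base of the codon.
        by_cc = any_in_range(cc, last - window - 1, first - 2)
        if not (by_gg or by_cc):
            out.append(k // 3 + 1)
    return out
```

For example, `unreachable_codons(genomic_seq, exon_intervals, window=11)` would give the residues out of range under the stricter 11 nt rule, where `genomic_seq` and `exon_intervals` stand for my own data.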
How should I go about trying to solve my problem?
Thanks for the help!