Hi guys!
Our team recently used CRISPR to delete 45 bases (15 amino acids) from the gene encoding a particular protein. To confirm the deletion, we sequenced a 135-base PCR product that contains the deleted sequence plus flanking bases on either side.
This is the first time I have worked with DNA sequences. Is there a specific protocol I should follow?
When I align the unprocessed files, I see a 6-base deletion instead of a 45-base deletion. When I clean and filter the sequences, I am left with a very small number of short sequences.
In addition, we ran the PCR product on a gel and saw a difference between the WT and DEL samples (the DEL bands were above the WT bands).
Does anyone have an idea?
Thank you!
How did you align the sequences? Be sure to use a global aligner such as EMBOSS-Needle, which performs end-to-end alignments, rather than something like BLAST, which favours local alignments and is not meant to capture longer deletions: it will happily return a partial alignment as long as the alignment score is good.
Example:
Say the top was the WT sequence and the bottom was the sequence after deletion:
EMBOSS-Needle gets it correctly:
Whereas BLAST:
The score of this local alignment is simply higher than that of the full-length alignment, where all the deleted bases incur gap penalties. Hence, BLAST is not suitable here.
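To make the end-to-end idea concrete, here is a minimal sketch of a global (Needleman-Wunsch) alignment in plain Python. It is not EMBOSS-Needle itself: the sequences are made up for illustration, the scoring values (match=1, mismatch=-1, gap=-2) are arbitrary assumptions, and it uses a simple linear gap penalty, whereas EMBOSS-Needle uses affine (open/extend) gap penalties.

```python
import re

def needleman_wunsch(ref, query, match=1, mismatch=-1, gap=-2):
    """Global (end-to-end) alignment with a linear gap penalty."""
    n, m = len(ref), len(query)
    # score[i][j] = best score for aligning ref[:i] with query[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if ref[i - 1] == query[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, so both sequences
    # are always consumed completely (this is what makes it global).
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            a.append(ref[i - 1]); b.append(query[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(ref[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(query[j - 1]); j -= 1
    return "".join(reversed(a)), "".join(reversed(b)), score[n][m]

def gap_runs(aligned):
    """Lengths of each run of '-' characters in an aligned sequence."""
    return [len(run) for run in re.findall(r"-+", aligned)]

# Hypothetical toy sequences, NOT the real amplicon.
left = "ATGGCCATTGTAATGGGCCGC"
deleted = "TGAAAGGGTGCCCGATAG"
right = "GGTACCTTCGAACTGAGT"
wt = left + deleted + right
edited = left + right

aln_wt, aln_del, aln_score = needleman_wunsch(wt, edited)
print(aln_wt)
print(aln_del)
print("gap runs in edited sequence:", gap_runs(aln_del))
```

Because the traceback is forced to cover both sequences from end to end, every deleted base shows up as a gap character in the edited sequence. With a linear gap penalty the gap may be placed slightly differently or split where neighbouring bases happen to match, which is one reason real tools prefer affine penalties.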
Thank you for the detailed answer! I used two aligners, STAR and bowtie2 (I did the alignment twice to make sure). For visualization, I used IGV.
Just to confirm, this is Sanger sequencing data, right? I would just give EMBOSS-Needle a try; the nice thing is that you can really see by eye how the actual alignment looks, base by base.
No, it's not Sanger sequencing data. We did it by NGS (Illumina).
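For NGS reads that have already been aligned, one way to check which deletion sizes the aligner actually reported is to tally the gap operations in the CIGAR strings (column 6 of a SAM file). Below is a minimal sketch, assuming the CIGAR strings have already been extracted; the example strings are invented, and `gap_lengths` and `tally_gaps` are hypothetical helper names. In the SAM format, `D` is a deletion from the reference and `N` is a skipped region; a spliced aligner such as STAR may report a longer gap as `N` rather than `D`, so the sketch counts both.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def gap_lengths(cigar, ops="DN"):
    """Return the lengths of all D (deletion) and N (skipped) operations."""
    return [int(length) for length, op in CIGAR_OP.findall(cigar) if op in ops]

def tally_gaps(cigars):
    """Count how many times each gap length occurs across all reads."""
    counts = Counter()
    for cigar in cigars:
        counts.update(gap_lengths(cigar))
    return counts

# Invented CIGAR strings for illustration only.
example_cigars = [
    "45M45D45M",   # read spanning a 45-base deletion
    "45M45N45M",   # same gap reported as a skipped region
    "60M6D24M",    # read with a 6-base deletion
    "135M",        # read matching the reference end to end
    "45M45S",      # read soft-clipped instead of gapped
]

for length, count in sorted(tally_gaps(example_cigars).items()):
    print(f"{length}-base gap: {count} read(s)")
```

Soft-clipped reads (`S`) are worth a look as well, since an aligner that is reluctant to open a long gap may clip the far side of the deletion instead of reporting it.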