Hi All.
Currently, my goal is to figure out whether a CRISPR/Cas approach is inducing mutations besides the desired long deletion in a gene. I am comparing one wildtype sample against four clones (iPSCs derived from the WT). I used GATK 4.6.2 with HaplotypeCaller. All was fine and well until I looked into the details of some output files.
When I compared HaplotypeCaller output with and without the -ERC flag, I realized by accident that an SNV was missing when using -ERC. I had a manual look in IGV and found that this variant looks great and is heterozygous, as output by HaplotypeCaller in normal mode. The quality was also good.
Then I used mpileup to check whether everything looked okay.
samtools mpileup -r chr1:93996081-93996081 -f ./Homo_sapiens_assembly38.fasta -q 30 -Q 20 -aa -d 100000 WT.bqsr.bam
[mpileup] 1 samples in 1 input files
chr1 93996081 C 33 .G..g.gG.,,G.g...GG.G.GGGG.,gGGG. mlFlkFkFlGkFaFlFEFQmElkFFEEmkFEFC
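For reference, the base column of the pileup line above can be tallied with a short script. This is only a minimal sketch, assuming the standard samtools pileup encoding (`.`/`,` for reference matches, letters for mismatches, `^x` and `$` for read start/end, `+n`/`-n` for indels); the function name `count_pileup_bases` is made up for illustration.

```python
import re
from collections import Counter


def count_pileup_bases(ref_base, bases):
    """Tally base observations in one samtools mpileup base column.

    Strand is ignored: 'G' and 'g' are both counted as G.
    """
    counts = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read-start marker, followed by one mapping-quality character.
            i += 2
            continue
        if ch == "$":
            # Read-end marker.
            i += 1
            continue
        if ch in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            digits = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(digits) + int(digits)
            continue
        if ch in ".,":
            counts[ref_base.upper()] += 1
        else:
            # Mismatching base; other symbols such as '*' are kept as-is.
            counts[ch.upper()] += 1
        i += 1
    return counts


# Base column copied from the mpileup line in the post (reference base C).
counts = count_pileup_bases("C", ".G..g.gG.,,G.g...GG.G.GGGG.,gGGG.")
print(dict(counts))  # {'C': 16, 'G': 17}
print(f"alt fraction: {counts['G'] / sum(counts.values()):.2f}")  # alt fraction: 0.52
```

With 17 of 33 reads showing G on both strands, the raw pileup is consistent with a heterozygous C>G site, which matches the impression from IGV.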
Finally, I also tried calling with DeepVariant, and it did not report that variant.
To sum up, I am confused about what to trust and am starting to have trust issues with these tools. I would like to use -ERC mode and do joint genotyping of all samples, then split the result into single-sample VCFs with bcftools view ... and go on with annotation, parsing, etc. But my fear is that I will lose variants like this one, which seem to be obviously present when I look manually and were called in single-sample mode for every sample, not only the wildtype. It would be bad to lose random variants in an experiment that is supposed to detect off-target effects of the CRISPR/Cas knockout approach.
Any suggestions or help you could share would be appreciated. : ) Thanks. Cheers.
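One way to see how many sites are affected is to list the variants present in a single-sample call set but absent after joint genotyping. The following is a rough sketch, not a validated comparison tool: it assumes plain-text (uncompressed) VCF lines, compares on exact CHROM/POS/REF/ALT, and does not normalize indel representation; the function names and the file names in the usage comment are hypothetical.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from an iterable of VCF text lines."""
    keys = set()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):
            # Ignore missing ALT, the GVCF placeholder, and spanning deletions.
            if alt in (".", "<NON_REF>", "*"):
                continue
            keys.add((chrom, int(pos), ref, alt))
    return keys


def missing_after_joint(single_lines, joint_lines):
    """Return variants found in the single-sample calls but not in the joint calls."""
    return sorted(variant_keys(single_lines) - variant_keys(joint_lines))


# Usage (file names are placeholders):
# with open("WT.single.vcf") as s, open("WT.joint.vcf") as j:
#     for key in missing_after_joint(s, j):
#         print(*key, sep="\t")
```

Any site this reports would still need a manual look, since differences in allele representation between the two files can also show up as "missing".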
Please show us an IGV screenshot of the position.
GATK realigns the reads locally, discards reads with too much clipping, etc., so you may not get the same results as with mpileup.