I have aligned my samples to both a pangenome graph and a linear reference genome. However, only ~20% of the SNPs are shared between the two call sets, and concordance within these shared SNPs is only 70%.
The variants from the linear reference genome were called using GATK.
This is how I mapped the short reads to the pangenome graph built with PGGB. The vg version is v1.64.0.
1) VG GIRAFFE. Map the cleaned short reads to the graph, which was indexed using vg autoindex.
singularity exec vg_v1.64.0.sif vg giraffe \
-p \
-t 8 \
-Z file.giraffe.gbz \
-d file.dist \
-m file.min \
-f read1.fq.gz -f read2.fq.gz > sample123.gam
2) VG PACK
singularity exec vg_v1.64.0.sif vg pack -x file.giraffe.gbz -g sample123.gam --min-mapq 10 --threads 8 -o file.pack
3) VG CALL
singularity exec vg_v1.64.0.sif vg call file.giraffe.gbz -k file.pack -p path3 -p path4 --sample sample123 --genotype-snarls --all-snarls --threads 8 > sample123.vcf
bgzip sample123.vcf
4) Filter the variants
bcftools view -f PASS sample123.vcf.gz -Oz -o sample123.PASS.vcf.gz
bcftools view -v snps sample123.PASS.vcf.gz | \
bgzip -c > sample123.PASS.SNPs.vcf.gz
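For context, here is a minimal sketch of how the overlap between two SNP call sets could be counted. It is an illustration only, not necessarily the comparison behind the numbers above. The function names `vcf_site_keys` and `site_overlap` and the GATK file name in the usage comment are hypothetical. It assumes both VCFs are gzip- or bgzip-compressed and use identical contig names. Sites are matched on exact CHROM, POS, REF and ALT, so unnormalized or multi-allelic records will be counted as different sites.

```shell
# Print one "CHROM:POS:REF:ALT" key per variant record of a compressed VCF.
vcf_site_keys() {
    gzip -dc "$1" \
        | awk -F'\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' \
        | LC_ALL=C sort -u
}

# Count sites shared by two VCFs and sites private to each one.
# The body runs in a subshell so the temporary variables do not leak.
site_overlap() (
    a_keys=$(mktemp)
    b_keys=$(mktemp)
    vcf_site_keys "$1" > "$a_keys"
    vcf_site_keys "$2" > "$b_keys"
    shared=$(LC_ALL=C comm -12 "$a_keys" "$b_keys" | wc -l)
    only_a=$(LC_ALL=C comm -23 "$a_keys" "$b_keys" | wc -l)
    only_b=$(LC_ALL=C comm -13 "$a_keys" "$b_keys" | wc -l)
    rm -f "$a_keys" "$b_keys"
    # Arithmetic expansion strips any padding that wc may add.
    echo "shared=$((shared)) only_a=$((only_a)) only_b=$((only_b))"
)

# Hypothetical usage (the GATK file name is an assumption):
# site_overlap sample123.PASS.SNPs.vcf.gz sample123.gatk.SNPs.vcf.gz
```

If the contig names differ between the two files (for example, a graph path name versus a plain chromosome name), no site will match with this approach, so that is worth checking first.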
Can you suggest what may have caused the low similarity? Could my vg call command be incorrect?