GATK ver.: 220.127.116.11
Picard ver.: 2.21.4
samtools ver.: 1.10
I'm learning to create a pipeline for variant calling. I started with an arbitrarily chosen exome from 1000genomes, in the form of two FASTQ files.
I pre-processed the data using the GATK Best Practices workflow https://gatk.broadinstitute.org/hc/en-us/articles/360035535912-Data-pre-processing-for-variant-discovery
and ended up with a supposedly "analysis-ready" BAM file.
1000genomes also provides a .cram file (as well as a .cram.crai and a .bam.bas). How can I compare my file with the one that is provided? I converted the .cram into a .bam file and I'm looking for a way to compare the two.
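A minimal sketch of one possible comparison, not a definitive method: decode the CRAM with `samtools view -b -T`, run `samtools flagstat` on both BAM files, and list the categories whose read counts differ. The file names and the helper functions (`cram_to_bam_cmd`, `parse_flagstat`, `flagstat`, `diff_counts`) are hypothetical, and the reference FASTA is assumed to be the same one the CRAM was encoded against.

```python
import re
import subprocess


def cram_to_bam_cmd(cram, reference, bam):
    """Build the samtools command that decodes a CRAM into a BAM."""
    # -b: write BAM, -T: reference FASTA used by the CRAM, -o: output file
    return ["samtools", "view", "-b", "-T", reference, "-o", bam, cram]


def parse_flagstat(text):
    """Map each `samtools flagstat` category to its QC-passed read count."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"(\d+) \+ (\d+) (.+)", line)
        if not m:
            continue
        # Strip the trailing percentage or QC note so labels match across files.
        label = re.sub(r" \((?:QC-passed|[^()]*:)[^()]*\)$", "", m.group(3))
        counts[label] = int(m.group(1))
    return counts


def flagstat(bam):
    """Run `samtools flagstat` on a BAM (requires samtools on the PATH)."""
    result = subprocess.run(
        ["samtools", "flagstat", bam],
        check=True, capture_output=True, text=True,
    )
    return parse_flagstat(result.stdout)


def diff_counts(mine, theirs):
    """Return only the categories whose counts differ, as (mine, theirs)."""
    keys = sorted(set(mine) | set(theirs))
    return {k: (mine.get(k), theirs.get(k)) for k in keys
            if mine.get(k) != theirs.get(k)}


if __name__ == "__main__":
    # Placeholder paths, to be replaced with the real files.
    subprocess.run(cram_to_bam_cmd("theirs.cram", "ref.fa", "theirs.bam"), check=True)
    print(diff_counts(flagstat("mine.bam"), flagstat("theirs.bam")))
```

Identical counts would not prove the two files are equivalent, and differences are expected if the reference build, aligner version or pre-processing steps differ; this only gives a first, coarse picture.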
Next, for the variant calling, 1000genomes provides a .vcf file for each chromosome. How can I know which types of variants were called (SNP, SNV, indels, CNV, ...)? Would I be able to check the validity of my own .vcf result?
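One rough way to see which variant types a VCF contains is to tally its records by the shape of the REF and ALT alleles. The sketch below is only an illustration under simple assumptions: the helper names `classify` and `tally` and the category labels are hypothetical, symbolic alleles such as `<CNV>` or `<DEL>` and breakend notation are lumped together as structural, and no normalization or left-alignment is attempted.

```python
from collections import Counter


def classify(ref, alt):
    """Classify one REF/ALT pair by allele shape."""
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "symbolic/SV"   # e.g. <CNV>, <DEL>, or breakend notation
    if alt in (".", "*"):
        return "other"         # no alternate allele, or spanning deletion
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) == len(alt):
        return "MNV"           # same-length multi-base substitution
    return "indel"


def tally(lines):
    """Count variant classes over the data lines of a VCF."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue           # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):   # multi-allelic sites list several ALTs
            counts[classify(ref, alt)] += 1
    return counts
```

For a gzip-compressed file, the lines could be read with `gzip.open(path, "rt")` and passed directly to `tally`. The `##INFO` and `##ALT` header lines of the VCF are also worth reading, since they often describe which variant classes the file is meant to contain.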
Any help would be appreciated. Don't hesitate to ask for more information.
Thank you in advance,