Specifically, the genome sequence uses the 'chr' prefix and also includes unplaced contigs, but the SNP VCF file does not. I am wondering whether I can simply prepend 'chr' to the contig names in the SNP file (assuming that unplaced contigs are included), or whether there is a SNP file (with indels included) for the Ensembl genome (ideally for both GRCh38 and GRCh37).
EDIT: Upon further inspection, the SNP VCF file with 'papu' notation does include unplaced contigs, but it still lacks the 'chr' prefix. I also found that Ensembl has its own dbSNP build (version 144) that corresponds to Ensembl 83 (GRCh38), but I do not see a download link. I also see that UCSC adopted the Gencode/Ensembl format, but their SNP set does not include entries for unplaced contigs. First, does this matter for the purpose of running GATK? Second, is it possible to merge common, clinically associated, and multimapped variants into one VCF, and is this advisable?
ERROR /00-All.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT]
ERROR reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM, GL000008.2, GL000009.2, GL000194.1, GL000195.1, GL000205.2, GL000208.1, GL000213.1, GL000214.1, GL000216.2, GL000218.1, GL000219.1, GL000220.1, GL000221.1, GL000224.1, GL000225.1, GL000226.1, KN538364.1, KQ031383.1, KN538369.1, JH159136.1, JH159137.1, KQ031387.1, KN538360.1, KN196484.1, KN196476.1, KN196479.1, KN196473.1, KN196487.1, KN196475.1, KQ090016.1, KN538361.1, KN196474.1, KQ090022.1, KN196478.1, KN196480.1, KQ090028.1, KN196483.1, KN196481.1, KN538363.1, KN538362.1, KQ031385.1, KQ031386.1, KQ031388.1, KN538365.1, KN538366.1, KN538367.1, KN538370.1, KN538373.1, KN538371.1, KQ031384.1, KN538372.1, KQ090021.1, KN196482.1, KQ458386.1, KN196472.1, GL383545.1, GL383546.1, KI270824.1, KI270825.1, KQ090020.1, GL383547.1, KN538368.1, KI270826.1, KI270827.1, KI270829.1, KI270830.1, KI270831.1, KI270832.1, KI270902.1, KI270903.1, KI270927.1, GL877875.1, GL383549.1, GL383550.2, KQ090023.1, GL877876.1, GL383552.1, KI270904.1, GL383553.2, KI270835.1, GL383551.1, KI270837.1, KI270833.1, KI270834.1, KI270836.1, KI270838.1, KI270839.1, KI270840.1, KI270841.1, KI270842.1, KI270843.1, KQ090024.1, KQ090025.1, KI270844.1, KI270845.1, KI270846.1, KI270847.1, KI270852.1, KI270848.1, GL383554.1, KI270906.1, GL383555.2, KI270851.1, KI270849.1, KI270905.1, KI270850.1, KQ031389.1, KI270853.1, GL383556.1, GL383557.1, KI270855.1, KQ031390.1, KI270856.1, KQ090027.1, KQ090026.1, KI270854.1, KI270909.1, GL383563.3, KI270861.1, GL383564.2, GL000258.2, KI270860.1, KI270907.1, KI270862.1, ... ...
(contracted to meet character limit)
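To make the renaming idea concrete, here is a minimal Python sketch of what "adding 'chr'" would involve, based only on the two contig lists in the error message above. The mapping is an assumption, not a verified fix: 1-22, X and Y gain a 'chr' prefix, MT becomes chrM, and every other name (such as the GL/KI/KN accessions) is left unchanged because the reference lists those without a prefix. The function names are hypothetical.

```python
import re

# Primary chromosome names as they appear in the VCF contig list above.
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y"}

# Matches the ID field of a "##contig=<ID=...>" header line.
_CONTIG_HEADER = re.compile(r"^(##contig=<ID=)([^,>]+)")


def rename_contig(name):
    """Map a VCF contig name to the reference-style name (assumed mapping)."""
    if name in PRIMARY:
        return "chr" + name
    if name == "MT":
        return "chrM"
    # Unplaced contigs and anything else are passed through untouched.
    return name


def rename_vcf_line(line):
    """Rename the contig in one VCF line, leaving other content unchanged."""
    if line.startswith("##contig="):
        return _CONTIG_HEADER.sub(
            lambda m: m.group(1) + rename_contig(m.group(2)), line
        )
    if line.startswith("#"):
        # Other meta lines and the #CHROM column header stay as they are.
        return line
    chrom, sep, rest = line.partition("\t")
    return rename_contig(chrom) + sep + rest


def rename_vcf(lines):
    """Lazily rename contigs across an iterable of VCF lines."""
    for line in lines:
        yield rename_vcf_line(line)
```

Renaming alone may not be enough: as far as I understand, GATK also checks that contigs appear in the same order as in the reference, so the output would likely need re-sorting and re-indexing. Whether MT and chrM refer to the same sequence also depends on the reference build, so that part of the mapping should be checked before relying on it.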