I hope somebody can help me. I am following a pipeline to call SNPs from FASTQ files. I have successfully aligned my reads to hg38 and marked duplicates in my BAM file with Picard. Now I am at the step where I use GATK to call variants. I know GATK needs a dbSNP file as a reference for known sites, so I downloaded the latest dbSNP release (dbSNP154 v2) from this website: https://ftp.ncbi.nih.gov/snp/latest_release/VCF/. The chromosomes are named differently in that release (not the way they are named in my hg38 reference), so I looked at the assembly report, extracted the relevant columns, and renamed the chromosomes using bcftools annotate --rename-chrs. I then had to reorder the records in the new file using bcftools sort, because indexing the file with tabix was giving me an error. However, after all these steps, when I run GATK on my sample with this dbSNP154 v2 file I get the following error: htsjdk.samtools.SAMException: Sequence name '' doesn't match regex: '[0-9A-Za-z!#$%&+./:;?@^_|~-][0-9A-Za-z!#$%&+./:;=?@^_|~-]'
If I run GATK with an older version of dbSNP (like 151), it works perfectly fine. Any ideas on how I can run GATK using dbSNP154 v2 for known sites? Thanks!
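
For reference, the "extract the columns from the assembly report" step described above can be sketched roughly as follows. This is a minimal illustration, not the exact commands that were run. It assumes the usual NCBI assembly report layout (tab-separated, `#` comment lines, RefSeq accession in column 7, UCSC-style name in column 10). The function name `build_rename_map`, the variable `SAMPLE_REPORT`, and the second sample row are hypothetical and made up for the example. The sketch skips rows where either name is blank or `na`, so that no empty or placeholder name ends up in the mapping.

```python
def build_rename_map(report_text):
    """Build 'old_name<TAB>new_name' lines from assembly report text.

    Assumes the NCBI assembly report layout: tab-separated rows,
    comment lines starting with '#', RefSeq accession in column 7
    and UCSC-style name in column 10 (1-based).
    """
    pairs = []
    for line in report_text.splitlines():
        # Skip blank lines and the commented header block.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # malformed or truncated row
        refseq = fields[6].strip()
        ucsc = fields[9].strip()
        # Skip rows without a usable name on either side, so the
        # mapping never contains an empty or 'na' sequence name.
        if refseq in ("", "na") or ucsc in ("", "na"):
            continue
        pairs.append(f"{refseq}\t{ucsc}")
    return pairs


# Two illustrative rows: the first follows the GRCh38 entry for
# chromosome 1; the second is a made-up row with no UCSC-style name.
SAMPLE_REPORT = (
    "# Sequence-Name\tSequence-Role\tAssigned-Molecule\t"
    "Assigned-Molecule-Location/Type\tGenBank-Accn\tRelationship\t"
    "RefSeq-Accn\tAssembly-Unit\tSequence-Length\tUCSC-style-name\n"
    "1\tassembled-molecule\t1\tChromosome\tCM000663.2\t=\t"
    "NC_000001.11\tPrimary Assembly\t248956422\tchr1\n"
    "HYPOTHETICAL_PATCH\tfix-patch\t1\tChromosome\tXX000000.1\t=\t"
    "NW_000000000.1\tPATCHES\t1000\tna\n"
)

if __name__ == "__main__":
    for pair in build_rename_map(SAMPLE_REPORT):
        print(pair)
```

The resulting lines can be written to a file and passed to bcftools annotate --rename-chrs, which expects one "old_name new_name" pair per line. Contigs left out of the mapping keep their RefSeq names. Whether a blank or `na` entry in the mapping is what leads to the empty sequence name in the error above is only a guess.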