Question: BQSR Recalibration for Mycobacterium Genome Data
Hi, we are trying to run a SNP analysis on Mtb whole-genome data. Currently, we are stuck at the base recalibration step. It says that we require a dbSNP file to generate the BQSR table used when applying BQSR. A dbSNP file for Mycobacterium tuberculosis does not exist.

Any advice on how to move forward with base recalibration without a dbSNP file would be greatly appreciated. Thanks!

According to the GATK forum, you can bootstrap a database of known SNPs. Here's how it works:

First, do an initial round of SNP calling on your original, unrecalibrated data. Then take the SNPs that you have the highest confidence in and use that set as the database of known SNPs by feeding it as a VCF file to the base quality score recalibrator. Finally, do a real round of SNP calling with the recalibrated data. These steps can be repeated several times until convergence.
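As a rough sketch of one such round, assuming GATK4 is installed as `gatk`: the Python helper below only builds the command lines and does not run anything. The function name `bootstrap_round`, the file-name prefix scheme, and the `QUAL > 1000` cutoff for "highest confidence" are illustrative assumptions, not recommended values, and `--sample-ploidy 1` is my assumption for a haploid genome like Mtb. Check the flags against the GATK version you have.

```python
import shlex
from typing import List


def bootstrap_round(ref: str, bam: str, prefix: str,
                    min_qual: float = 1000.0) -> List[List[str]]:
    """Build the GATK4 commands for one bootstrap round (nothing is executed).

    min_qual is a hypothetical cutoff for picking high-confidence SNPs.
    """
    raw_vcf = f"{prefix}.raw.vcf.gz"      # initial calls on unrecalibrated data
    known_vcf = f"{prefix}.known.vcf.gz"  # high-confidence SNPs used as known sites
    table = f"{prefix}.recal.table"       # recalibration table
    recal_bam = f"{prefix}.recal.bam"     # recalibrated reads
    final_vcf = f"{prefix}.recal.vcf.gz"  # calls on recalibrated data
    return [
        # 1. Initial SNP calling on the original BAM.
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
         "-O", raw_vcf, "--sample-ploidy", "1"],
        # 2. Keep only SNPs above the (assumed) quality cutoff.
        ["gatk", "SelectVariants", "-R", ref, "-V", raw_vcf,
         "--select-type-to-include", "SNP",
         "-select", f"QUAL > {min_qual}", "-O", known_vcf],
        # 3. Build the recalibration table using those SNPs as known sites.
        ["gatk", "BaseRecalibrator", "-R", ref, "-I", bam,
         "--known-sites", known_vcf, "-O", table],
        # 4. Apply the recalibration to the original BAM.
        ["gatk", "ApplyBQSR", "-R", ref, "-I", bam,
         "--bqsr-recal-file", table, "-O", recal_bam],
        # 5. Call SNPs again on the recalibrated BAM.
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", recal_bam,
         "-O", final_vcf, "--sample-ploidy", "1"],
    ]


# Print the commands so they can be reviewed before running them by hand.
for cmd in bootstrap_round("ref.fasta", "sample.bam", "round1"):
    print(shlex.join(cmd))
```

To repeat the procedure, you could call the function again with a new prefix and compare the SNP sets between rounds; when they stop changing much, you have probably converged.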

The site will help you.

Thank you! This seems like it should work.

