Question: dbSNP ROD File Still Useful?
mylons130 wrote:

I've been trying to find the documentation on generating a dbSNP.rod file that used to be in the GATK's re-aligning workflows and documented in the GATK wiki, but due to their new licensing they've taken that site down.

I also stumbled upon this, which seems to imply they're not necessary anymore.

Can someone straighten me out?

Liye Zhang80 wrote:


  The rod file was used in older versions of GATK. If you are using the current version (1.6; I haven't tried 2.0 yet, so I can't speak for that one), I do not think you need the rod file.
  Instead, GATK now takes the VCF format directly, as long as your VCF file has the same chromosome order as your BAM file. Here is an example with UnifiedGenotyper (--dbsnp dbSNP.vcf replaces the rod file):

java -jar GenomeAnalysisTK.jar \
    -R resources/Homo_sapiens_assembly18.fasta \
    -T UnifiedGenotyper \
    -I sample1.bam [-I sample2.bam ...] \
    --dbsnp dbSNP.vcf \
    -o snps.raw.vcf \
    -stand_call_conf [50.0] \
    -stand_emit_conf 10.0 \
    -dcov [50] \
    [-L targets.interval_list]
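On the chromosome-order requirement: a quick sanity check before running GATK could look something like the sketch below. It is only a rough illustration, not anything that ships with GATK. The function names are made up for this example, and it compares the VCF against the reference's samtools-style .fai index rather than the BAM header, on the assumption that the BAM was aligned to that same reference. It also only looks at the order in which contigs first appear in the VCF, so it will not catch records that are unsorted within a contig.

```shell
# Hypothetical helpers (names are illustrative, not part of GATK or samtools).

# Print the contigs of a VCF in order of first appearance among its records.
vcf_contig_order() {
    awk -F'\t' '!/^#/ && !seen[$1]++ { print $1 }' "$1"
}

# Succeed if the contigs used by the VCF ($1) appear in the same relative
# order as in the reference .fai index ($2). Fails if the VCF uses a contig
# that the reference does not have.
# The body runs in a subshell, so the variables stay local to the function.
same_contig_order() (
    vcf_order=$(vcf_contig_order "$1")
    # First pass (VCF): remember which contigs are used.
    # Second pass (.fai): print those contigs in reference order.
    ref_order=$(awk -F'\t' '
        NR == FNR { if ($0 !~ /^#/) used[$1] = 1; next }
        ($1 in used) { print $1 }
    ' "$1" "$2")
    [ "$vcf_order" = "$ref_order" ]
)
```

With the file names from the example above, and assuming the .fai index sits next to the reference, usage would be along the lines of `same_contig_order dbSNP.vcf resources/Homo_sapiens_assembly18.fasta.fai && echo "contig order matches"`.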

Hope this will clarify your questions.

Yes it did. I also finally found this documentation on the portion I was about to use:

They state this in there: --known / -known (List[RodBinding[VariantContext]] with default value [])

Input VCF file with known indels. Any number of VCF files representing known SNPs and/or indels. Could be e.g. dbSNP and/or official 1000 Genomes indel calls. SNPs in these files will be ignored unless the --mismatchFraction argument is used. --known binds reference ordered data. This argument supports ROD files of the following types: VCF, VCF3

I didn't realize a VCF counts as a ROD. That was essentially my mistake. Thanks!

