GATK BaseRecalibrator known-sites vcf file
5 weeks ago
Jordi • 0

Hi,

I am trying to run GATK's BaseRecalibrator on a BAM file created with the hg19 reference sequence downloaded from UCSC website.

For the --known-sites option I would like to use either a gnomAD .vcf file or a dbSNP .vcf, downloaded from their respective websites.

The analysis works if I use the 00-common_all.vcf file from dbSNP; however this file was created on hg38, and I cannot find the hg19 equivalent on their website.

With any gnomAD .vcf file, on the other hand, the analysis fails: the chromosome naming in the reference does not match the naming in the .vcf file, and the following error occurs:

A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.
  reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chrX, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr20, chrY, chr19, chr22, chr21, chr6_ssto_hap7, chr6_mcf_hap5, chr6_cox_hap2, chr6_mann_hap4, chr6_apd_hap1, chr6_qbl_hap6, chr6_dbb_hap3, chr17_ctg5_hap1, chr4_ctg9_hap1, chr1_gl000192_random, chrUn_gl000225, chr4_gl000194_random, chr4_gl000193_random, chr9_gl000200_random, chrUn_gl000222, chrUn_gl000212, chr7_gl000195_random, chrUn_gl000223, chrUn_gl000224, chrUn_gl000219, chr17_gl000205_random, chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chr9_gl000199_random, chrUn_gl000211, chrUn_gl000213, chrUn_gl000220, chrUn_gl000218, chr19_gl000209_random, chrUn_gl000221, chrUn_gl000214, chrUn_gl000228, chrUn_gl000227, chr1_gl000191_random, chr19_gl000208_random, chr9_gl000198_random, chr17_gl000204_random, chrUn_gl000233, chrUn_gl000237, chrUn_gl000230, chrUn_gl000242, chrUn_gl000243, chrUn_gl000241, chrUn_gl000236, chrUn_gl000240, chr17_gl000206_random, chrUn_gl000232, chrUn_gl000234, chr11_gl000202_random, chrUn_gl000238, chrUn_gl000244, chrUn_gl000248, chr8_gl000196_random, chrUn_gl000249, chrUn_gl000246, chr17_gl000203_random, chr8_gl000197_random, chrUn_gl000245, chrUn_gl000247, chr9_gl000201_random, chrUn_gl000235, chrUn_gl000239, chr21_gl000210_random, chrUn_gl000231, chrUn_gl000229, chrM, chrUn_gl000226, chr18_gl000207_random, chr1_jh806574_fix, chr1_gl949741_fix, chr1_jh636053_fix, chr1_jh636052_fix, chr1_gl383518_alt, chr1_gl383519_alt, chr1_gl383520_alt, chr1_jh806573_fix, chr1_jh636054_fix, chr1_jh806575_fix, chr1_gl383516_fix, chr1_gl383517_fix, chr2_gl383521_alt, chr2_kb663603_fix, chr2_gl877871_fix, chr2_gl582966_alt, chr2_gl383522_alt, chr2_gl877870_fix, chr3_jh636055_alt, chr3_jh159132_fix, chr3_gl383523_fix, chr3_ke332495_fix, chr3_gl383524_fix, chr3_jh159131_fix, chr3_gl383525_fix, chr3_gl383526_alt, chr4_ke332496_fix, chr4_gl383528_alt, chr4_gl383529_alt, chr4_gl582967_fix, chr4_gl383527_alt, 
chr4_gl877872_fix, chr5_gl383532_alt, chr5_gl949742_alt, chr5_gl339449_alt, chr5_gl383530_alt, chr5_jh159133_fix, chr5_ke332497_fix, chr5_gl383531_alt, chr6_jh806576_fix, chr6_jh636057_fix, chr6_gl383533_alt, chr6_kb663604_fix, chr6_jh636056_fix, chr6_ke332498_fix, chr6_kb021644_alt, chr7_gl582970_fix, chr7_gl582969_fix, chr7_ke332499_fix, chr7_jh159134_fix, chr7_gl582972_fix, chr7_gl582968_fix, chr7_jh636058_fix, chr7_gl383534_alt, chr7_gl582971_fix, chr8_gl949743_fix, chr8_ke332500_fix, chr8_jh159135_fix, chr8_gl383535_fix, chr8_gl383536_fix, chr9_gl383539_alt, chr9_jh636059_fix, chr9_gl383540_alt, chr9_gl383541_alt, chr9_gl383542_alt, chr9_kb663605_fix, chr9_jh806579_fix, chr9_gl339450_fix, chr9_jh806577_fix, chr9_jh806578_fix, chr9_gl383537_fix, chr9_gl383538_fix, chr10_gl877873_fix, chr10_jh636060_fix, chr10_gl383543_fix, chr10_gl383545_alt, chr10_gl383546_alt, chr10_jh591181_fix, chr10_kb663606_fix, chr10_jh591183_fix, chr10_ke332501_fix, chr10_jh591182_fix, chr10_gl383544_fix, chr10_jh806580_fix, chr11_jh591184_fix, chr11_jh591185_fix, chr11_gl383547_alt, chr11_gl582973_fix, chr11_jh159136_alt, chr11_jh159137_alt, chr11_gl949744_fix, chr11_jh806581_fix, chr11_jh159143_fix, chr11_jh159141_fix, chr11_jh159139_fix, chr11_jh159142_fix, chr11_jh159140_fix, chr11_jh720443_fix, chr11_jh159138_fix, chr12_gl582974_fix, chr12_gl877875_alt, chr12_jh720444_fix, chr12_gl949745_alt, chr12_gl877876_alt, chr12_gl383549_alt, chr12_gl383550_alt, chr12_gl383552_alt, chr12_gl383553_alt, chr12_kb663607_fix, chr12_gl383551_alt, chr12_gl383548_fix, chr13_gl582975_fix, chr14_kb021645_fix, chr15_gl383554_alt, chr15_gl383555_alt, chr15_jh720445_fix, chr16_gl383556_alt, chr16_jh720446_fix, chr16_gl383557_alt, chr17_jh806582_fix, chr17_gl383563_alt, chr17_gl383562_fix, chr17_gl383561_fix, chr17_ke332502_fix, chr17_jh159145_fix, chr17_kb021646_fix, chr17_gl383560_fix, chr17_gl383559_fix, chr17_jh159146_alt, chr17_jh159148_alt, chr17_jh159147_alt, chr17_gl383564_alt, chr17_gl582976_fix, 
chr17_jh720447_fix, chr17_gl383558_fix, chr17_jh159144_fix, chr17_gl383565_alt, chr17_gl383566_alt, chr17_jh591186_fix, chr17_jh636061_fix, chr18_gl383567_alt, chr18_gl383570_alt, chr18_gl383571_alt, chr18_gl383568_alt, chr18_gl383569_alt, chr18_gl383572_alt, chr19_jh159149_fix, chr19_gl582977_fix, chr19_gl383573_alt, chr19_gl383575_alt, chr19_gl383576_alt, chr19_gl383574_alt, chr19_ke332505_fix, chr19_kb021647_fix, chr19_gl949746_alt, chr19_gl949747_alt, chr19_gl949748_alt, chr19_gl949749_alt, chr19_gl949750_alt, chr19_gl949751_alt, chr19_gl949752_alt, chr19_gl949753_alt, chr20_gl383577_alt, chr20_jh720448_fix, chr20_kb663608_fix, chr20_gl582979_fix, chr21_gl383578_alt, chr21_gl383579_alt, chr21_gl383580_alt, chr21_gl383581_alt, chr21_ke332506_fix, chr22_jh806584_fix, chr22_jh806583_fix, chr22_jh806585_fix, chr22_jh720449_fix, chr22_gl383583_alt, chr22_gl383582_alt, chr22_kb663609_alt, chr22_jh806586_fix, chrX_gl877877_fix, chrX_jh720451_fix, chrX_jh720452_fix, chrX_jh806589_fix, chrX_kb021648_fix, chrX_jh806590_fix, chrX_jh806587_fix, chrX_jh806591_fix, chrX_jh806592_fix, chrX_jh720453_fix, chrX_jh720454_fix, chrX_jh806593_fix, chrX_jh806594_fix, chrX_jh806595_fix, chrX_jh720455_fix, chrX_jh806588_fix, chrX_jh806601_fix, chrX_jh806602_fix, chrX_jh806603_fix, chrX_jh806596_fix, chrX_jh806597_fix, chrX_jh806598_fix, chrX_jh806599_fix, chrX_jh806600_fix, chrX_jh159150_fix, chrMT]
  features contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y]

Does anyone have any input .vcf file from either database correctly formatted to be used for this purpose on hg19? How would you proceed?

Thanks a lot for any input.

gnomad baserecalibrator gatk • 264 views
5 weeks ago
Ram 34k

You have a good grasp of the exact problems you're facing. Here are two options to consider:

  1. Rename the chromosomes in the gnomAD file with bcftools annotate --rename-chrs so that they match the reference (1 becomes chr1, and so on).
  2. Get an hg19 dbSNP VCF file; it should be available, just dig deeper. EDIT: It took just a few extra seconds of navigating from the link you pasted to get to this directory: https://ftp.ncbi.nih.gov/snp/organisms/human_9606_b150_GRCh37p13/VCF/
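
For option 1, bcftools annotate --rename-chrs expects a plain-text map with one "old_name new_name" pair per line. Below is a minimal sketch of how that map could be generated for the contigs listed in the error above (1-22, X, Y). The function names and the file names chr_map.txt, gnomad.vcf.gz and gnomad.chr.vcf.gz are placeholders chosen for illustration, not something from the original post.

```python
# Sketch: write the chromosome map used by `bcftools annotate --rename-chrs`.
# Assumes the VCF names its contigs 1..22, X, Y (as in the error message)
# and that the reference uses the UCSC-style "chr" prefix.

def build_chr_map():
    """Return (old_name, new_name) pairs, e.g. ("1", "chr1")."""
    names = [str(i) for i in range(1, 23)] + ["X", "Y"]
    return [(name, "chr" + name) for name in names]


def write_chr_map(path):
    """Write one whitespace-separated 'old new' pair per line."""
    pairs = build_chr_map()
    with open(path, "w") as fh:
        for old, new in pairs:
            fh.write(f"{old} {new}\n")
    return pairs


if __name__ == "__main__":
    write_chr_map("chr_map.txt")  # placeholder file name
    # Then, on the command line (file names are placeholders):
    #   bcftools annotate --rename-chrs chr_map.txt gnomad.vcf.gz -Oz -o gnomad.chr.vcf.gz
    #   tabix -p vcf gnomad.chr.vcf.gz
```

The renamed file needs its own index before it is passed to --known-sites, which is what the tabix step is for. Renaming only changes contig labels; it does not check that the VCF and the BAM were built on the same genome assembly, so that still has to be confirmed separately.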

The dbSNP file should do the trick. Thanks a lot!


I've moved my comment to an answer. Can you please accept it to mark the post resolved?


This was exactly what I wanted to know. Thank you, Ram, for the information, and thanks also to Jordi for asking the question. I ran into the same problem as Jordi, and it was solved by using the 00-All.vcf.gz file in the gatk folder.
