Question: HaplotypeCaller with --dbsnp does not populate ID column
4.5 years ago, dmyersturnbull90 (Stanford University) wrote:

I want to obtain a VCF file containing genotype calls and their scores for every rsID, whether or not a variant was called. I was planning to use the following steps:

  1. HaplotypeCaller --genotyping_mode DISCOVERY --output_mode EMIT_VARIANTS_ONLY --emitRefConfidence BP_RESOLUTION, as shown below
  2. awk '{ if ( $3 != "." ) { print $0; } }' variants.vcf > variants.filtered.vcf
  3. GenotypeGVCFs --includeNonVariantSites
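
The filter in step 2 could also be written so that header lines are passed through explicitly and the ID column is read as a tab-separated field. This is only a minimal sketch, not a tested replacement for the awk one-liner: the function names `keep_records_with_id` and `filter_vcf` are made up for illustration, and it assumes an uncompressed, tab-delimited VCF.

```python
from typing import Iterable, Iterator


def keep_records_with_id(lines: Iterable[str]) -> Iterator[str]:
    """Yield VCF header lines plus records whose ID column is not '.'."""
    for line in lines:
        if line.startswith("#"):
            # Meta-information (##...) and the #CHROM line are always kept.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # ID is the third fixed VCF column (index 2).
        if len(fields) > 2 and fields[2] != ".":
            yield line


def filter_vcf(in_path: str, out_path: str) -> None:
    """Write the filtered VCF; the paths are whatever files you pass in."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(keep_records_with_id(src))
```

For the files named above, the call would be `filter_vcf("variants.vcf", "variants.filtered.vcf")`. Unlike the awk version, which splits on any whitespace, this splits on tabs only, so a header line containing spaces can never be mistaken for a record.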

Using the most recent dbSNP download here: ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/All.vcf, I ran this:

GATK -T HaplotypeCaller --reference_sequence GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --input_file recalibrated.bam --dbsnp current_dbsnp/All.vcf.gz --genotyping_mode DISCOVERY --output_mode EMIT_VARIANTS_ONLY --emitRefConfidence BP_RESOLUTION --out variants.vcf

However, the ID column only contains ".". How can I get HaplotypeCaller to populate the ID column with rsIDs? Also, is there a better way to get variant and non-variant genotype calls with HaplotypeCaller?

 

modified 4.5 years ago by Jordan1.1k • written 4.5 years ago by dmyersturnbull90
4.5 years ago, Jordan1.1k (Pittsburgh) wrote:

It looks like your reference genome is from Ensembl (GRCh38), which uses a 1-based coordinate system, whereas the dbSNP file you used is from NCBI, which uses a 0-based coordinate system, just like UCSC.

Could that be why no dbSNP variants are matched and the ID column only shows "."?

written 4.5 years ago by Jordan1.1k

Thanks.

That's concerning, then. I thought VCF was always 1-based.

However, I don't think that's the issue, since, with BP_RESOLUTION, literally every position is called (chr1:1, chr1:2, chr1:3, ...).

written 4.5 years ago by dmyersturnbull90