Question: how to add reference alleles to VCF?
dec986 (United States) wrote, 10 weeks ago:

I’m converting gVCFs to VCF, but the reference alleles are missing. An example below:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  180525_FD02929177
1   97547947    .   T   .   .   .   DP=31   GT:DP:RGQ   0/0:31:81
1   97915614    .   C   .   .   .   DP=40   GT:DP:RGQ   0/0:40:99
1   97981343    .   A   .   .   .   DP=43   GT:DP:RGQ   0/0:43:99
2   234668570   .   C   T   539.64  .   AC=1;AF=0.500;AN=2;ClippingRankSum=0.340;DP=32;ExcessHet=3.0103;FS=5.748;MLEAC=1;MLEAF=0.500;MQ=60.00;QD=16.86;RAW_MQ=115200.00;SOR=0.150    GT:AD:DP:GQ:PL   0/1:17,15:32:99:547,0,586
2   234669144   .   G   .   .   .   DP=36   GT:DP:RGQ   0/0:36:99

which was made by break_blocks:

break_blocks --region-file /illumina/runs/con/concordance/fluidigm/fluidigm_positions.tab.bed --ref human_g1k_v37.fasta --exclude-off-target

I’m using GATK thus:

gatk --java-options "-Xmx4g" GenotypeGVCFs \
     -R /illumina/runs/con/g1k_v37/human_g1k_v37.fasta \
     -V fluidigm.gvcf.202009/HG00099.fluidigm.202009.g.vcf \
     -O fluidigm.vcf.202009/HG00099.fluidigm.202009.vcf \
     --allow-old-rms-mapping-quality-annotation-data \
     --include-non-variant-sites

But none of the GATK options seem to add reference alleles to the REF column; every entry is just ".". When I try to fill them in manually with a Perl script, some data are missing, so scripting it myself doesn't work.

Do you know how I can add the reference alleles to VCF/gVCF?


Please do not delete a question after it has been addressed in some way. Misreading columns by eye is a common problem, and someone else could benefit from your experience.

Please accept my answer below using the green check mark on the left.

Comment written 10 weeks ago by _r_am
_r_am (Baylor College of Medicine, Houston, TX) wrote, 10 weeks ago:

I don't see any entry with a missing REF. Could it be that you're visually matching the ID column in the header to the REF column in the data?

See below:

#CHROM  POS        ID  REF  ALT  QUAL    FILTER  INFO               FORMAT          180525_FD02929177
1       97547947   .   T    .    .       .       DP=31              GT:DP:RGQ       0/0:31:81
1       97915614   .   C    .    .       .       DP=40              GT:DP:RGQ       0/0:40:99
1       97981343   .   A    .    .       .       DP=43              GT:DP:RGQ       0/0:43:99
2       234668570  .   C    T    539.64  .       AC=1;AF=0.500;...  GT:AD:DP:GQ:PL  0/1:17,15:32:99:547,0,586
2       234669144  .   G    .    .       .       DP=36              GT:DP:RGQ       0/0:36:99
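
One way to rule out a column-alignment illusion is to split each line into fields and read REF by position (the fourth column) rather than by eye. Below is a minimal Python sketch, assuming the records from the question are tab- or whitespace-delimited with no spaces inside a field. The helper names `parse_records` and `missing_ref` are hypothetical and are not part of GATK or any VCF library.

```python
# Sketch: read REF by field position instead of by visual alignment.
# Assumption: fields are separated by tabs (or runs of spaces) and no
# field contains embedded whitespace, as in the records from the question.

VCF_TEXT = """\
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t180525_FD02929177
1\t97547947\t.\tT\t.\t.\t.\tDP=31\tGT:DP:RGQ\t0/0:31:81
1\t97915614\t.\tC\t.\t.\t.\tDP=40\tGT:DP:RGQ\t0/0:40:99
1\t97981343\t.\tA\t.\t.\t.\tDP=43\tGT:DP:RGQ\t0/0:43:99
2\t234668570\t.\tC\tT\t539.64\t.\tAC=1;AF=0.500;AN=2;ClippingRankSum=0.340;DP=32;ExcessHet=3.0103;FS=5.748;MLEAC=1;MLEAF=0.500;MQ=60.00;QD=16.86;RAW_MQ=115200.00;SOR=0.150\tGT:AD:DP:GQ:PL\t0/1:17,15:32:99:547,0,586
2\t234669144\t.\tG\t.\t.\t.\tDP=36\tGT:DP:RGQ\t0/0:36:99
"""


def parse_records(text):
    """Yield (chrom, pos, ref, alt) for every non-header line."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        fields = line.split()  # splits on tabs and on runs of spaces
        chrom, pos, _id, ref, alt = fields[:5]
        yield chrom, int(pos), ref, alt


def missing_ref(records):
    """Return (chrom, pos) for records whose REF field is the missing value '.'."""
    return [(chrom, pos) for chrom, pos, ref, _alt in records if ref == "."]


records = list(parse_records(VCF_TEXT))
for chrom, pos, ref, alt in records:
    print(f"{chrom}:{pos}\tREF={ref}\tALT={alt}")
print("records with missing REF:", missing_ref(records))
```

For these five records the check finds a REF base on every line; the "." values belong to the ID, ALT, QUAL and FILTER columns. An ALT of "." at a 0/0 site means that no alternate allele was called there.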