How to insert quotes around the Description of FORMAT fields in a VCF header?
22 months ago
Alewa ▴ 170

I have a badly formatted VCF header. For some reason, the Description tag in the FORMAT fields doesn't have quotes.

According to the VCF documentation, it should look like this:

##FORMAT=<ID=ID,Number=number,Type=type,Description="description">

How can I put quotes around my descriptions?

thanks

$ bcftools view -h $main_vcf | less

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##ALT=<ID=NON_REF,Description=Represents any possible alternative allele at this location>
##FILTER=<ID=FS60,Description=FS>
##FILTER=<ID=LowQual,Description=Low quality>
##FILTER=<ID=MQ40,Description=MQ < 40.0>>
##FILTER=<ID=QD2,Description=QD < 2.0>>
##FILTER=<ID=QUAL30,Description=QUAL < 30.0>>
##FILTER=<ID=SOR3,Description=SOR>
##FORMAT=<ID=AD,Number=R,Type=Integer,Description=Allelic depths for the ref and alt alleles in the order listed>
##FORMAT=<ID=DP,Number=1,Type=Integer,Description=Approximate read depth (reads with MQ=255 or with bad mates are filtered)>
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description=Genotype Quality>
##FORMAT=<ID=GT,Number=1,Type=String,Description=Genotype>
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description=Minimum DP observed within the GVCF block>
##FORMAT=<ID=PS,Number=1,Type=Integer,Description=Phasing set (typically the position of the first variant in the set)>
##FORMAT=<ID=SB,Number=4,Type=Integer,Description=Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.>
##GATKCommandLine=<ID=VariantFiltration,CommandLine=VariantFiltration  --output Moldova_SNPs_filtered.vcf --filter-expression QD < 2.0 --filter-expression QUAL < 30.0 --filter-expression SOR > 3.0 --filter-expression FS > 60.0 --filter-expression MQ < 40.0 --filter-name QD2 --filter-name QUAL30 --filter-name SOR3 --filter-name FS60 --filter-name MQ40 --variant Moldova_SNPs.vcf  --cluster-size 3 --cluster-window-size 0 --mask-extension 0 --mask-name Mask --filter-not-in-mask false --missing-values-evaluate-as-failing false --invalidate-previous-filters false --invert-filter-expression false --invert-genotype-filter-expression false --set-filtered-genotype-to-no-call false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays  --disable-tool-default-read-filters false,Version=4.1.2.0,Date=January 20, 2021 4:12:58 PM PST>>
##INFO=<ID=AN,Number=1,Type=Integer,Description=Total number of alleles in called genotypes>
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description=Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities>
##INFO=<ID=DP,Number=1,Type=Integer,Description=Approximate read depth; some reads may have been filtered>
##INFO=<ID=DS,Number=0,Type=Flag,Description=Were any of the samples downsampled?>
##INFO=<ID=END,Number=1,Type=Integer,Description=Stop position of the interval>
##INFO=<ID=ExcessHet,Number=1,Type=Float,Description=Phred-scaled p-value for exact test of excess heterozygosity>
##INFO=<ID=FS,Number=1,Type=Float,Description=Phred-scaled p-value using Fisher's exact test to detect strand bias>
##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description=Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation>
##INFO=<ID=MQ,Number=1,Type=Float,Description=RMS Mapping Quality>
22 months ago
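This quotes every unquoted Description value in the header and leaves already-quoted ones (such as the PASS filter line) untouched. The result goes to stdout:

 sed '/^#/s/Description=\([^"].*\)>$/Description="\1">/'  in.vcf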
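
For the header posted in the question, a slightly more defensive variant may be worth considering. The sketch below wraps the same sed idea in a shell function (the name `quote_vcf_descriptions` is made up for illustration). It only touches `##` meta lines, skips lines whose Description is already quoted, and collapses a doubled closing `>>` (as in the `MQ40`, `QD2` and `QUAL30` lines) into a single `>`. That last step is an assumption: it treats the extra `>` as damage rather than part of the description text.

```shell
# Hypothetical helper: reads a VCF (or just its header) on stdin and
# writes the repaired text to stdout.
quote_vcf_descriptions() {
  # 1. Leave anything that is not a "##" meta line untouched.
  # 2. Leave lines whose Description is already quoted untouched.
  # 3. Quote the remaining Description values; any run of trailing '>'
  #    is replaced by a single closing '>' (assumption: extra '>' is damage).
  sed -e '/^##/!b' \
      -e '/Description="/b' \
      -e 's/Description=\(.*[^>]\)>*$/Description="\1">/'
}

# Usage (file names are placeholders):
#   quote_vcf_descriptions < in.vcf > fixed.vcf
```

For a compressed or indexed file, one option would be to dump the header with `bcftools view -h`, pass it through the function, and put it back with `bcftools reheader -h`. Note that neither this nor the one-liner can restore text that is already missing, such as the truncated `Description=FS>` and `Description=SOR>` entries.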

thanks, that helps!
