How to ignore "GT" in a VCF file
3.2 years ago
SUDOsundu ▴ 80

I used the medaka_haploid_variant caller to identify variants in my viral reads. After running medaka tools annotate, I got an annotated VCF file. When I then filter it with bcftools, bcftools complains that GT is not defined in the annotated VCF:

bcftools filter -Ob -e 'DP<1000' -o filtered.vcf annotated.vcf

I got the following error:

FORMAT 'GT' at NC_xxxxx:33 is not defined in the header, assuming Type=String

From the GitHub issue I opened (https://github.com/nanoporetech/medaka/issues/257), I learned that GT does not appear in the VCF header but does occur in the records, which is a bug in medaka.

Now, how can I run bcftools while ignoring GT? Is that possible?

bcftools nanopore medaka • 980 views

The easiest fix would be to add one line to the header, which you can do with any text editor. The other option is to remove the GT fields from the FORMAT column (and the matching values from the sample columns), which can be a bit more complicated, depending on your prior experience.
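
For the second option, here is a rough sketch of stripping GT with awk. It is only an illustration on a made-up toy record: the file names `toy.vcf` and `toy.noGT.vcf`, the `GT:GQ` layout, and the values are assumptions, not real medaka output. If GT is the only FORMAT key, the record is left untouched, since an empty FORMAT column would not be valid.

```shell
# Toy record standing in for an annotated medaka line (values are made up).
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n' > toy.vcf
printf 'NC_xxxxx\t33\t.\tA\tG\t40\tPASS\tDP=1200\tGT:GQ\t1:40\n' >> toy.vcf

awk 'BEGIN { FS = OFS = "\t" }
/^#/ { print; next }                      # pass header lines through unchanged
{
    n = split($9, keys, ":")              # FORMAT keys, e.g. GT:GQ
    drop = 0
    for (i = 1; i <= n; i++) if (keys[i] == "GT") drop = i
    if (drop && n > 1) {                  # skip records where GT is the only key
        for (c = 9; c <= NF; c++) {       # FORMAT column plus every sample column
            m = split($c, parts, ":")
            out = ""; sep = ""
            for (i = 1; i <= m; i++) {
                if (i == drop) continue   # leave out the GT position
                out = out sep parts[i]; sep = ":"
            }
            $c = out
        }
    }
    print
}' toy.vcf > toy.noGT.vcf

cat toy.noGT.vcf
```

On the toy record, the last two columns change from `GT:GQ` and `1:40` to `GQ` and `40`. Adding the header line is still the simpler route, because it keeps the genotype information.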

3.2 years ago

Fix your VCF header:

 awk '/^#CHROM/ {printf("##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">\n");} {print}'  input.vcf
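
A slightly more defensive variant of the same idea, as a sketch: only insert the GT line when the header does not already define it, so the fix stays safe to rerun once medaka is patched. The toy file names (`toy_in.vcf`, `toy_fixed.vcf`) and the record values are made up for illustration.

```shell
# Toy VCF whose header lacks the GT definition (values are made up).
printf '##fileformat=VCFv4.1\n' > toy_in.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n' >> toy_in.vcf
printf 'NC_xxxxx\t33\t.\tA\tG\t40\tPASS\tDP=1200\tGT:GQ\t1:40\n' >> toy_in.vcf

if grep -q '^##FORMAT=<ID=GT,' toy_in.vcf; then
    # GT is already declared: copy the file through unchanged.
    cp toy_in.vcf toy_fixed.vcf
else
    # Insert the GT definition immediately before the #CHROM line.
    awk '/^#CHROM/ { print "##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">" } { print }' \
        toy_in.vcf > toy_fixed.vcf
fi

grep -c '^##FORMAT=<ID=GT,' toy_fixed.vcf   # prints 1
```

With a real file, the fixed VCF can then be passed to the original `bcftools filter` command in place of annotated.vcf.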