Extract columns from VCF file using bcftools preserving all the header info
4.4 years ago

I am trying to extract only the columns I need from a VCF file while preserving the VCF structure, header, and formatting. I am using bcftools. I tried:

bcftools annotate -c CHROM,POS,ID,REF,ALT,QUAL,FILTER,INFO/AF,INFO/AC,INFO/AN Holland.vcf -o Holland_selected_cols.vcf

But the output file is identical to the input. Then I tried query:

bcftools query -f'[%CHROM\t%POS\t%ID\t%REF\t%ALT\t%QUAL\t%FILTER\t%INFO/AF;%INFO/AC;%INFO/AN\n]' -H Holland.vcf -o Holland_selected_cols.vcf

But that does not preserve the VCF header. What would be the right bcftools command for this?

bcftools VCF • 5.7k views
4.4 years ago

Use bcftools annotate -x to remove every annotation except the ones you want to keep (a leading ^ inverts the selection from "remove these" to "keep these"):

bcftools annotate -x ^INFO/AF,^INFO/AC,^INFO/AN,^FORMAT input.vcf
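For reference, the bcftools manual documents the keep-list form with a single leading ^ per annotation type, e.g. ^INFO/FOO,INFO/BAR. Below is a minimal sketch of that form, assuming bcftools is on your PATH. The file names demo.vcf and demo_selected.vcf and the one-record VCF are made up for illustration, and quoting the -x argument is just a precaution against shells that treat ^ specially.

```shell
# Build a tiny, made-up VCF so the example is self-contained (hypothetical data).
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=1,length=1000>\n'
  printf '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">\n'
  printf '##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count">\n'
  printf '##INFO=<ID=AN,Number=1,Type=Integer,Description="Allele number">\n'
  printf '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n'
  printf '1\t100\trs1\tA\tG\t50\tPASS\tAF=0.5;AC=1;AN=2;DP=30\tGT\t0/1\n'
} > demo.vcf

# Only run the bcftools step if the tool is actually installed.
if command -v bcftools >/dev/null 2>&1; then
  # Keep INFO/AF, INFO/AC and INFO/AN; all other INFO tags (here DP) are dropped.
  bcftools annotate -x '^INFO/AF,INFO/AC,INFO/AN' -o demo_selected.vcf demo.vcf

  # The output is still a VCF with its header; list the INFO definitions that remain.
  grep '^##INFO' demo_selected.vcf
fi
```

The fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER) are mandatory in VCF, so they stay in place either way; only the contents of INFO change.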

If the genotypes also need to be removed, a different command can be used: bcftools view -G input.vcf > output.vcf. Genotypes are, in a sense, columns too.
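The two steps can be chained with a pipe. Here is a rough sketch, again assuming bcftools is on your PATH; demo_gt.vcf and demo_sites.vcf are placeholder names and the record is invented for the example.

```shell
# Tiny made-up VCF with one sample column (hypothetical data).
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=1,length=1000>\n'
  printf '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">\n'
  printf '##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count">\n'
  printf '##INFO=<ID=AN,Number=1,Type=Integer,Description="Allele number">\n'
  printf '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n'
  printf '1\t100\trs1\tA\tG\t50\tPASS\tAF=0.5;AC=1;AN=2;DP=30\tGT\t0/1\n'
} > demo_gt.vcf

# Only run the bcftools steps if the tool is actually installed.
if command -v bcftools >/dev/null 2>&1; then
  # Step 1: view -G drops the FORMAT and sample columns.
  # Step 2: annotate -x keeps only the listed INFO tags ("-" reads from stdin).
  bcftools view -G demo_gt.vcf \
    | bcftools annotate -x '^INFO/AF,INFO/AC,INFO/AN' -o demo_sites.vcf -

  # Show the column header line of the result.
  grep '^#CHROM' demo_sites.vcf
fi
```

The result should be a sites-only VCF: the column header line ends at INFO, and the meta-information header is still present.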
