Question: Adding An Extra Column To A VCF File
10.3 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

Hi all, I've just written a tool that adds one or more extra columns to a VCF file. The header now looks like this:

#CHROM    POS    ID    REF    ALT    QUAL    FILTER    INFO    MY_COL1    MY_COL2    FORMAT    NA00001    NA00002    NA00003

Is there something in the VCF spec saying that another column can't be added? I'm asking because when I run VCFtools on the file, it says:

vcftools --vcf file.vcf 
Scanning file.vcf ... 
Ninth Header entry should be FORMAT: MY_COL1
Currently scanning CHROM: 19
Currently scanning CHROM: 20
Currently scanning CHROM: X
sequencing next-gen format vcf • 5.3k views
modified 6 months ago by Biostar ♦♦ 20 • written 10.3 years ago by Pierre Lindenbaum129k

I fixed this problem by creating a new file format :-)

modified 23 months ago by RamRS28k • written 10.2 years ago by Pierre Lindenbaum129k
10.1 years ago by
United States
Aaronquinlan11k wrote:

BEDTools now supports VCF and you can tack on any number of columns you want. That said, if you are looking for specific functionality within VCFTools, then this isn't helpful at all.

written 10.1 years ago by Aaronquinlan11k
9.6 years ago by
United States
lh332k wrote:

Instead of inserting new columns, which will screw up most tools, you should put your custom information in the INFO column (the 8th column). This is what that field is designed for. With Perl, it is very easy to extract a key-value pair from there, e.g.:

perl -ane 'print "MYKEY=$1\n" if $F[7]=~/MYKEY=([^;]+)/'

Furthermore, VCF is used not only for SNPs, but also for INDELs and SVs. Various people from several major sequencing centers took part in the discussion that shaped this format. In my opinion, it is quite stable now. Small details may change in the future, but not the number of columns.
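
Going the other way, a file that already has extra columns after INFO can be converted by folding them into INFO as key=value pairs. Below is a minimal Python sketch of that idea, not a tested tool. It assumes the extra columns sit directly after INFO (as in the header shown in the question), that their values contain no whitespace, semicolons or equals signs, and that each one can be declared as a single String value. The function name `fold_extra_columns` and the example record are made up for illustration.

```python
def fold_extra_columns(lines, n_extra):
    """Yield VCF lines with the n_extra columns after INFO folded into INFO.

    Assumption: the custom columns sit directly after INFO (column 8).
    """
    keys = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines pass through unchanged.
            yield line
            continue
        cols = line.split("\t")
        extra = cols[8:8 + n_extra]   # the custom columns
        rest = cols[8 + n_extra:]     # FORMAT and sample columns, if any
        if line.startswith("#CHROM"):
            keys = extra
            # Declare each new INFO key just before the header line.
            for key in keys:
                yield (f'##INFO=<ID={key},Number=1,Type=String,'
                       f'Description="Custom column {key}">')
            yield "\t".join(cols[:8] + rest)
            continue
        if keys is None:
            raise ValueError("data line found before the #CHROM header")
        info = [] if cols[7] in ("", ".") else cols[7].split(";")
        # Skip missing values instead of writing empty keys.
        info += [f"{k}={v}" for k, v in zip(keys, extra) if v not in ("", ".")]
        cols[7] = ";".join(info) or "."
        yield "\t".join(cols[:8] + rest)


# Made-up example: the header from the question, cut down to one sample.
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tMY_COL1\tMY_COL2\tFORMAT\tNA00001",
    "20\t14370\t.\tG\tA\t29\tPASS\tDP=14\tfoo\tbar\tGT\t0|0",
]
out = list(fold_extra_columns(example, n_extra=2))
for row in out:
    print(row)
```

After this, the ninth header entry is FORMAT again, and the custom values can be pulled out of INFO with the Perl one-liner above (e.g. with MYKEY replaced by MY_COL1).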

modified 23 months ago by RamRS28k • written 9.6 years ago by lh332k
10.3 years ago by
London, UK
Giovanni M Dall'Olio27k wrote:

What kind of information do you want to add? I don't think the VCF specification allows adding new columns, but since this format is still in an early phase of development, you could contact the authors and propose the new functionality to them.

However, VCF files should be used only to describe the SNPs and their genotypes, and any other kind of information should go somewhere else. For example, if you have statistics associated with a SNP, you should consider a flat file or a database.

written 10.3 years ago by Giovanni M Dall'Olio27k