Dear all,
I have a problem regarding the manipulation of multiple VCF files (containing the same variants and referring to the same sample): I would like to merge their INFO fields.
The context.
Say I have the following VCF file (headers not included):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample
chr13 32903685 . C T 7555.77 PASS . GT:AD:DP:GQ:PL 0/1:219,340:569:99:7584,0,4763
Now, I create two copies of the same VCF file and annotate each of them with a different annotation source. The first one becomes:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample
chr13 32903685 . C T 7555.77 PASS CustomOne=1 GT:AD:DP:GQ:PL 0/1:219,340:569:99:7584,0,4763
while the second one becomes:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample
chr13 32903685 . C T 7555.77 PASS CustomTwo=2 GT:AD:DP:GQ:PL 0/1:219,340:569:99:7584,0,4763
I would now like to merge the two annotated copies, so as to obtain:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample
chr13 32903685 . C T 7555.77 PASS CustomOne=1;CustomTwo=2 GT:AD:DP:GQ:PL 0/1:219,340:569:99:7584,0,4763
Basically, the result I would like to achieve keeps the same #CHROM, POS, REF, ALT, QUAL, FILTER, FORMAT and sample columns, and merges the contents of the INFO column found in each copy.
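To make the desired behaviour concrete, here is a minimal Python sketch of the merge I have in mind. It is only an illustration under some assumptions: the data lines are tab-separated, every copy lists the same variant with identical non-INFO columns, and when the same INFO key appears in several copies the first value wins. The function names merge_info and merge_records are hypothetical, and the sketch does not handle the ##INFO header lines.

```python
def merge_info(info_values):
    """Merge several INFO strings, keeping the first occurrence of each key."""
    merged = {}
    for info in info_values:
        if info in (".", ""):
            continue  # "." denotes an empty INFO field
        for entry in info.split(";"):
            key, sep, value = entry.partition("=")
            # Flags (entries without "=") are stored with value None
            merged.setdefault(key, value if sep else None)
    if not merged:
        return "."
    return ";".join(k if v is None else f"{k}={v}" for k, v in merged.items())


def merge_records(lines):
    """Merge data lines (one per annotated copy) that describe the same variant."""
    rows = [line.rstrip("\n").split("\t") for line in lines]
    first = rows[0]
    # Sanity check: every column except INFO (index 7) must be identical
    for row in rows[1:]:
        if row[:7] != first[:7] or row[8:] != first[8:]:
            raise ValueError("copies differ outside the INFO column")
    merged = list(first)
    merged[7] = merge_info(row[7] for row in rows)
    return "\t".join(merged)


if __name__ == "__main__":
    template = ("chr13\t32903685\t.\tC\tT\t7555.77\tPASS\t{info}\t"
                "GT:AD:DP:GQ:PL\t0/1:219,340:569:99:7584,0,4763")
    copies = [template.format(info="CustomOne=1"),
              template.format(info="CustomTwo=2")]
    print(merge_records(copies))
```

Since merge_records takes a list of lines, the same logic would apply to any number of annotated copies, not just two.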
The solution I tried.
I tried several options, without success:
- bcftools merge, but this is meant for merging different samples, while I am working with the same sample
- bcftools concat, but this concatenates the two VCF files
- SnpSift annotate, but this does not accept more than two files, so I cannot use it if there are more than two copies to merge
My question!
Can you suggest how to proceed?
Thank you for your help.