Question: Merge multiple VCF files (same variants, same sample) into one VCF file
Eleanore0 wrote:

Dear all,

I have a problem with manipulating multiple VCF files (containing the same variants and referring to the same sample) so as to merge their INFO fields.

The context.

Say I have the following VCF file (headers not included):

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  sample
chr13   32903685    .   C   T   7555.77 PASS    .   GT:AD:DP:GQ:PL  0/1:219,340:569:99:7584,0,4763

Now, I create two copies of the same VCF file and annotate each of them with a different annotation source. So, the first one becomes:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  sample
chr13   32903685    .   C   T   7555.77 PASS    CustomOne=1 GT:AD:DP:GQ:PL  0/1:219,340:569:99:7584,0,4763

while the second one becomes:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  sample
chr13   32903685    .   C   T   7555.77 PASS    CustomTwo=2 GT:AD:DP:GQ:PL  0/1:219,340:569:99:7584,0,4763

I would like now to merge the aforementioned copies, so as to obtain:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  sample
chr13   32903685    .   C   T   7555.77 PASS    CustomOne=1;CustomTwo=2 GT:AD:DP:GQ:PL  0/1:219,340:569:99:7584,0,4763

Basically, the result I would like to achieve maintains the same #CHROM, POS, REF, ALT, QUAL, FILTER, FORMAT and sample columns, and merges the contents of the INFO column found in each copy.

The solution I tried.

I tried (unsuccessfully) with several options:

  • bcftools merge, but this is meant for merging different samples, while I am working with the same sample
  • bcftools concat, but this concatenates the records of the VCF files instead of merging their INFO fields
  • SnpSift annotate, but this does not accept more than two files, so I cannot use it when more than two copies need to be merged

My question!

Can you suggest how to proceed?

Thank you for your help.

annotation vcf • 148 views
modified 4 weeks ago • written 4 weeks ago by Eleanore0

trausch730 wrote:

Two INFO fields with the same name "Custom" are not allowed, but I think recent bcftools versions can rename INFO fields while importing them:

bcftools annotate -a custom1.vcf.gz -c INFO/CustomImported:=INFO/Custom custom2.vcf.gz

written 4 weeks ago by trausch730

Yeah, sorry, my example was wrong. I am going to re-edit the question to use two different INFO fields. So, does this command allow multiple files too?

modified 4 weeks ago • written 4 weeks ago by Eleanore0

Maybe there is a more elegant solution but pipes should work:

zcat custom1.vcf.gz | bcftools annotate -a custom2.vcf.gz -c INFO/CustomTwo - | bcftools annotate -a custom3.vcf.gz -c INFO/CustomThree -

written 4 weeks ago by trausch730
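
For more than a couple of files, the chain above could be generated rather than typed by hand. The following is only a sketch: annotate_chain and its file:TAG argument convention are hypothetical names made up for illustration, and the bcftools invocation is the same one used in the reply above. It prints the pipeline instead of running it and assumes file names without spaces.

```shell
# Hypothetical helper: print (do not run) a chained bcftools pipeline.
# Usage: annotate_chain base.vcf.gz ann2.vcf.gz:TagTwo ann3.vcf.gz:TagThree ...
annotate_chain() {
    base=$1
    shift
    cmd="zcat $base"
    for pair in "$@"; do
        file=${pair%%:*}   # text before the first colon: annotation file
        tag=${pair#*:}     # text after the first colon: INFO tag to copy
        cmd="$cmd | bcftools annotate -a $file -c INFO/$tag -"
    done
    printf '%s\n' "$cmd"
}
```

The printed command can be inspected first and then piped to sh. This still starts one bcftools process per annotation file, so it only saves typing.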

This is the solution that I applied at first, but it does not scale, since it opens a new annotation process for every additional file (N-1 processes if there are N copies). Isn't there a tool that does this operation for me, without launching several annotation processes?

written 4 weeks ago by Eleanore0
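
Not a dedicated tool, but a single awk process can do this kind of merge in one pass over all N copies. The sketch below rests on several assumptions: merge_vcf_info is a hypothetical function name, the inputs are uncompressed VCF files, records are matched on CHROM, POS, REF and ALT, and no INFO key occurs in more than one copy. Every distinct ## header line is kept once, so the ##INFO definitions of all copies end up in the output.

```shell
# Hypothetical helper: merge the INFO column of several copies of one VCF.
# Usage: merge_vcf_info copy1.vcf copy2.vcf ... > merged.vcf
merge_vcf_info() {
    awk -F'\t' '
        # Meta lines: keep each distinct ## line once, in order of appearance.
        /^##/ {
            if (!($0 in seen_hdr)) { seen_hdr[$0] = 1; hdr[++nh] = $0 }
            next
        }
        # Column header line: keep the one from the first file only.
        /^#/ { if (chrom_line == "") chrom_line = $0; next }
        {
            key = $1 SUBSEP $2 SUBSEP $4 SUBSEP $5
            # First time this variant is seen: remember the whole record.
            if (!(key in rec)) { order[++n] = key; rec[key] = $0; info[key] = "" }
            # Append non-empty INFO content, separated by semicolons.
            if ($8 != "." && $8 != "") {
                if (info[key] == "") info[key] = $8
                else info[key] = info[key] ";" $8
            }
        }
        END {
            for (i = 1; i <= nh; i++) print hdr[i]
            if (chrom_line != "") print chrom_line
            for (i = 1; i <= n; i++) {
                k = order[i]
                nf = split(rec[k], f, "\t")
                if (info[k] == "") f[8] = "."
                else f[8] = info[k]
                line = f[1]
                for (j = 2; j <= nf; j++) line = line "\t" f[j]
                print line
            }
        }
    ' "$@"
}
```

All records are held in memory until the end, so this suits files of moderate size. Compressed inputs would need to be decompressed first, and duplicate INFO keys are not detected.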