Question: How to merge vcf files with different variants but same samples?
2.2 years ago by
humeira.tayyab10 wrote:

I have VCF files with exactly the same meta region and the same column names for the fixed and genotype (gt) regions, but different variants. I want to merge them into a single VCF file with the same meta region and the combined fixed and gt regions, like this:

file1.vcf

##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   10  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   11  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   12  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   13  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1

file2.vcf

##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   14  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   15  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   16  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   17  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1

merged.vcf

##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   10  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   11  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   12  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   13  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
1   14  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   15  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   16  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   17  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
ADD COMMENTlink modified 15 months ago by Renesh1.9k • written 2.2 years ago by humeira.tayyab10

What have you tried? Have you checked vcftools/bcftools? Also, please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

ADD REPLYlink written 2.2 years ago by RamRS27k
2.2 years ago by
Kevin Blighe61k wrote:

Just use bcftools concat. You should also get into the habit of normalising your VCF files before running downstream analyses on them. One part of that, splitting multiallelic sites into separate records, can be done with bcftools norm -m-any (I have not done that for the purposes of this answer):

bgzip file1.vcf
bgzip file2.vcf

tabix -p vcf file1.vcf.gz
tabix -p vcf file2.vcf.gz
bcftools concat file1.vcf.gz file2.vcf.gz

##fileformat=VCFv4.1 
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=1>
##bcftools_concatVersion=1.2+htslib-1.2.1
##bcftools_concatCommand=concat file1.vcf.gz file2.vcf.gz
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   10  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   11  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   12  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   13  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
1   14  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   15  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   16  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   17  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
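The commands above print the merged VCF to standard output. As a rough sketch of how the result could be written to a compressed, indexed file instead, something like the following might work. The function names `split_multiallelics` and `concat_to_file` are made up for illustration, and the sketch assumes bcftools and tabix are on the PATH and that the inputs are already bgzipped.

```shell
# Hypothetical helpers (names are illustrative, not part of bcftools).
# Both assume bcftools and tabix are installed and inputs are bgzipped VCFs.

# Split multiallelic sites into separate records.
# Usage: split_multiallelics in.vcf.gz out.vcf.gz
split_multiallelics() {
    bcftools norm -m-any -O z -o "$2" "$1"
}

# Concatenate inputs in the order given, write bgzipped VCF, then index it.
# Usage: concat_to_file merged.vcf.gz file1.vcf.gz file2.vcf.gz
concat_to_file() {
    out=$1
    shift
    bcftools concat -O z -o "$out" "$@" &&
    tabix -p vcf "$out"
}
```

Note that bcftools concat expects every input to carry the same samples in the same order, which is the situation described in the question.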
ADD COMMENTlink written 2.2 years ago by Kevin Blighe61k
2.2 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum129k wrote:

(changed)

Use Picard GatherVcfs: https://broadinstitute.github.io/picard/command-line-overview.html

ADD COMMENTlink modified 2.2 years ago • written 2.2 years ago by Pierre Lindenbaum129k
2.2 years ago by
cpad011213k
India
cpad011213k wrote:

Since both VCFs contain the same samples and have identical headers:

$ cat test1.vcf <(awk '!/^#/ {print}' test2.vcf)

##fileformat=VCFv4.1 
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   10  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   11  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   12  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   13  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
1   14  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   15  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   16  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   17  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
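Plain concatenation like this is only safe when the column header lines really do match. A minimal sketch of a check that could be run first is given below; `same_samples` is a hypothetical helper name, and it only compares the #CHROM line, not the rest of the meta region.

```shell
# Hypothetical helper: succeed only if two VCFs have the same #CHROM line,
# i.e. the same fixed columns and the same samples in the same order.
same_samples() {
    a=$(awk '/^#CHROM/ {print; exit}' "$1")
    b=$(awk '/^#CHROM/ {print; exit}' "$2")
    # Fail if the first file has no #CHROM line or the two lines differ.
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

It could be used as a guard, for example `same_samples test1.vcf test2.vcf && cat test1.vcf <(awk '!/^#/' test2.vcf)`.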
ADD COMMENTlink modified 2.2 years ago • written 2.2 years ago by cpad011213k

cough sorting cough

ADD REPLYlink written 2.2 years ago by RamRS27k

Records are already coordinate sorted.

ADD REPLYlink modified 2.2 years ago • written 2.2 years ago by cpad011213k

That's not all the data, surely. Better safe than sorry.
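For the plain-text approach, one way to be safe is to sort the records after concatenating. The sketch below assumes uncompressed VCFs with identical headers; `concat_sorted` is a hypothetical name, and chromosomes are ordered lexically (so "10" sorts before "2"), which may not match the contig order a downstream tool expects.

```shell
# Hypothetical helper: concatenate VCFs that share an identical header,
# then sort records by chromosome (lexically) and position (numerically).
concat_sorted() {
    # Header lines (starting with '#') are taken from the first file only.
    grep '^#' "$1"
    # Data lines from every file, sorted by column 1, then column 2.
    cat "$@" | grep -v '^#' | sort -k1,1 -k2,2n
}

# Usage: concat_sorted file1.vcf file2.vcf > merged.vcf
```

Recent bcftools releases also provide a sort subcommand, which is likely the sturdier option for real data.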

ADD REPLYlink written 2.2 years ago by RamRS27k
15 months ago by
Renesh1.9k
United States
Renesh1.9k wrote:

Check this link to merge vcf files https://reneshbedre.github.io/blog/mergevcf.html

ADD COMMENTlink written 15 months ago by Renesh1.9k