Question: How to merge vcf files with different variants but same samples?
1
2.9 years ago by
humeira.tayyab10 wrote:

I have VCF files with exactly the same meta region and the same column names for the fix and gt regions, but different variants. I want to merge them into a single VCF file with the same meta region and the combined fix and gt regions, like this:

file1.vcf

 ##fileformat=VCFv4.1 
 ##FILTER=<ID=PASS,Description="Passed all filters">
 ##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
 ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
 #CHROM POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   10  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   11  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   12  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   13  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1

file2.vcf

  ##fileformat=VCFv4.1
  ##FILTER=<ID=PASS,Description="Passed all filters">  
 ##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
 ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
 #CHROM POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   14  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   15  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   16  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   17  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1

merged.vcf

 ##fileformat=VCFv4.1
 ##FILTER=<ID=PASS,Description="Passed all filters">
 ##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
 ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
 #CHROM POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   10  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   11  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   12  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   13  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
1   14  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   15  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   16  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   17  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
vcf • 4.9k views
modified 23 months ago by Renesh2.0k • written 2.9 years ago by humeira.tayyab10

What have you tried? Have you checked vcftools/bcftools? Also, please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

written 2.9 years ago by Ram32k
5
2.9 years ago by
Kevin Blighe71k
Republic of Ireland
Kevin Blighe71k wrote:

Just use bcftools concat. You should additionally get into the habit of normalising your VCF files prior to performing downstream analyses on them. This can be done with bcftools norm -m-any, which splits multiallelic sites into separate records (I have not done that for the purposes of this answer):

bgzip file1.vcf
bgzip file2.vcf

tabix -p vcf file1.vcf.gz
tabix -p vcf file2.vcf.gz
bcftools concat file1.vcf.gz file2.vcf.gz

##fileformat=VCFv4.1 
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=1>
##bcftools_concatVersion=1.2+htslib-1.2.1
##bcftools_concatCommand=concat file1.vcf.gz file2.vcf.gz
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   10  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   11  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   12  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   13  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
1   14  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   15  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   16  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   17  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
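
To keep the result on disk rather than on stdout, the steps above could be wrapped roughly as follows. This is only a sketch under assumptions: the function name concat_same_samples and the output name are made up for illustration, bgzip, tabix and bcftools are assumed to be on the PATH, and -a (allow overlaps) is added only in case the two files interleave by position, which the example data do not.

```shell
# concat_same_samples OUT.vcf.gz IN1.vcf IN2.vcf
# Hypothetical helper: compress and index both inputs, concatenate them,
# then index the result. Assumes both inputs carry the same samples.
concat_same_samples() {
    out=$1
    in1=$2
    in2=$3

    # bgzip -c writes to stdout, so the uncompressed originals are kept.
    bgzip -c "$in1" > "$in1.gz" || return 1
    bgzip -c "$in2" > "$in2.gz" || return 1

    # tabix -p takes the preset name (vcf) before the file name.
    tabix -f -p vcf "$in1.gz" || return 1
    tabix -f -p vcf "$in2.gz" || return 1

    # -a needs indexed inputs; -Oz writes bgzip-compressed VCF to -o.
    bcftools concat -a -Oz -o "$out" "$in1.gz" "$in2.gz" || return 1
    tabix -f -p vcf "$out"
}
```

It would be called as concat_same_samples merged.vcf.gz file1.vcf file2.vcf; the normalisation step mentioned above (bcftools norm -m-any) can then be run on merged.vcf.gz if wanted.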
written 2.9 years ago by Kevin Blighe71k
1
2.9 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum134k wrote:

(changed)

Use Picard GatherVcfs: https://broadinstitute.github.io/picard/command-line-overview.html

modified 2.9 years ago • written 2.9 years ago by Pierre Lindenbaum134k
0
2.9 years ago by
cpad011215k
Hyderabad India
cpad011215k wrote:

Since both VCFs contain the same samples and have identical headers:

$ cat test1.vcf <(awk '!/^#/ {print}' test2.vcf)

##fileformat=VCFv4.1 
##FILTER=<ID=PASS,Description="Passed all filters">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Read Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3
1   10  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   11  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   12  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   13  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
1   14  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 0/0 0/1
1   15  .   C   A   .   .   DP=3;CALLER=Samtools    GT  .   .   1/1
1   16  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/0 0/0 0/0
1   17  .   C   A   .   .   DP=3;CALLER=Samtools    GT  0/1 1/1 1/1
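
This shortcut relies on the two headers really matching. A quick guard, sketched below, could compare the #CHROM lines before concatenating; the function name same_vcf_columns is made up for illustration, and it only checks the column/sample line, not the ## meta lines.

```shell
# same_vcf_columns FILE1 FILE2
# Hypothetical helper: succeeds only if both files have a #CHROM line and
# the two lines are identical (same samples, same order).
same_vcf_columns() {
    cols1=$(awk '/^#CHROM/ {print; exit}' "$1")
    cols2=$(awk '/^#CHROM/ {print; exit}' "$2")
    [ -n "$cols1" ] && [ "$cols1" = "$cols2" ]
}
```

For instance, same_vcf_columns test1.vcf test2.vcf && echo "columns match" would print the message only when it is safe to append the records of the second file to the first.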
modified 2.9 years ago • written 2.9 years ago by cpad011215k
1

cough sorting cough

written 2.9 years ago by Ram32k

Records are already coordinate sorted.

modified 2.9 years ago • written 2.9 years ago by cpad011215k

That's not all the data, surely. Better safe than sorry.

written 2.9 years ago by Ram32k
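
On the sorting point raised in these comments, the cat/awk approach could be made order-safe along the lines below. It is a rough sketch with standard tools only: the function name merge_sorted_vcf is made up, the header is taken from the first file on the assumption that all headers are identical, and chromosomes are sorted as plain text (so 10 comes before 2), which groups each chromosome together but may not match the contig order of a reference.

```shell
# merge_sorted_vcf FILE1 [FILE2 ...]
# Hypothetical helper: print the header of FILE1, then the records of all
# files sorted by chromosome (text order) and position (numeric order).
merge_sorted_vcf() {
    # Header lines (## meta lines and the #CHROM line) from the first file.
    grep '^#' "$1"

    # Records from every file; the tab is passed to sort as field separator.
    awk '!/^#/' "$@" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
}
```

Usage would be merge_sorted_vcf test1.vcf test2.vcf > merged.vcf. With the files from the question the output is the same as the plain cat version, since those records are already in order.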
0
23 months ago by
Renesh2.0k
United States
Renesh2.0k wrote:

See this post on merging VCF files: https://reneshbedre.github.io/blog/mergevcf.html

written 23 months ago by Renesh2.0k