Question: compare and merge VCF files
qwzhang0601 (United States) wrote, 9 months ago:

We have SNP array data and whole-exome sequencing based SNP calls for the same group of samples.

We now have genotype data in VCF format from both techniques. The samples are the same, but the lists of SNPs differ, with some SNPs present in both datasets.

We want to merge the genotype VCF files from the array and the whole-exome sequencing. Are there tools that can do this for us (e.g., vcftools)? In our case, about 26k SNPs have genotypes called from both the array and the sequencing data. For those overlapping SNPs, I expect that some genotypes were called differently by the two techniques for certain SNPs and individuals, so I would also like to know how to handle inconsistent genotype calls when merging the two VCF files.

Thanks.


VCFtools has vcf-compare and vcf-merge. BCFtools has bcftools stats and bcftools merge. Both should do what you want.
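
As a rough sketch of the BCFtools route (not a tested pipeline): the function below assumes two bgzip-compressed, indexed files holding the same samples in the same order. The file names array.vcf.gz and exome.vcf.gz and the function name compare_and_merge are hypothetical. Because bcftools merge is designed for files with different samples, the sketch uses bcftools concat -a to combine the site lists instead. At shared sites it keeps the array record, which is only one possible policy; you may prefer the sequencing call, or to set discordant genotypes to missing.

```shell
# Hypothetical helper: compare two VCFs of the same samples, then combine them.
# Usage: compare_and_merge array.vcf.gz exome.vcf.gz outdir
compare_and_merge() {
    local array_vcf=$1 exome_vcf=$2 outdir=$3
    mkdir -p "$outdir" || return 1

    # 1. Split sites into private and shared sets:
    #    0000 = array only, 0001 = exome only,
    #    0002/0003 = shared sites as recorded in each input.
    bcftools isec -O z -p "$outdir/isec" "$array_vcf" "$exome_vcf" || return 1

    # 2. Per-sample genotype concordance between the two files
    #    ("-s -" asks for statistics on all samples).
    bcftools stats -s - "$array_vcf" "$exome_vcf" > "$outdir/concordance.txt" || return 1

    # 3. Combine: every array record plus the exome-only sites.
    #    Assumption: the array call wins at shared sites.
    bcftools index -f -t "$outdir/isec/0001.vcf.gz" || return 1
    bcftools concat -a -O z -o "$outdir/merged.vcf.gz" \
        "$array_vcf" "$outdir/isec/0001.vcf.gz" || return 1
}
```

A call such as compare_and_merge array.vcf.gz exome.vcf.gz out would leave the concordance report in out/concordance.txt and the combined file in out/merged.vcf.gz. Reading the concordance report before step 3 is the point: it tells you how many of the shared genotypes actually disagree.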

written 9 months ago by h.mon

Thanks! I will take a look.

written 9 months ago by qwzhang0601
Shicheng Guo wrote, 9 months ago:

Suppose you have chr22.chip and chr22.imputation (PLINK binary filesets) to be merged. You can try the following:

# 1. Remove duplicate variants from each dataset (NR > 1 skips the header of plink.dupvar).
plink --bfile chr22.chip --list-duplicate-vars
awk 'NR > 1 {print $4}' plink.dupvar > plink.dupvar.id
plink --bfile chr22.chip --exclude plink.dupvar.id --make-bed --out chr22.chip.rmdup
plink --bfile chr22.imputation --list-duplicate-vars
awk 'NR > 1 {print $4}' plink.dupvar > plink.dupvar.id
plink --bfile chr22.imputation --exclude plink.dupvar.id --make-bed --out chr22.imputation.rmdup
# 2. First merge attempt. If it fails, PLINK lists the problem variants in merge-merge.missnp.
plink --bfile chr22.imputation.rmdup --bmerge chr22.chip.rmdup --make-bed --out merge
# 3. Flip the strand of those variants in the chip data and try again.
plink --bfile chr22.chip.rmdup --flip merge-merge.missnp --make-bed --out chr22.chip.rmdup.flip
plink --bfile chr22.imputation.rmdup --bmerge chr22.chip.rmdup.flip --make-bed --out merge
# 4. Exclude the variants that still fail from both datasets, then merge once more.
plink --bfile chr22.imputation.rmdup --exclude merge-merge.missnp --make-bed --out chr22.imputation.rmdup.rm3
plink --bfile chr22.chip.rmdup.flip --exclude merge-merge.missnp --make-bed --out chr22.chip.rmdup.flip.rm3
plink --bfile chr22.imputation.rmdup.rm3 --bmerge chr22.chip.rmdup.flip.rm3 --make-bed --out merge
# 5. Estimate pairwise IBD on the merged data.
plink --bfile merge --genome --out merge.ibd

By the way, PLINK discards phase information, so be careful if you need to keep the genotypes phased.
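
One way to see whether this matters for your data is to count phased genotype calls before converting to PLINK format. In a VCF, phased genotypes use | as the allele separator and unphased ones use /. A minimal sketch, assuming an uncompressed VCF; the function name count_phased is made up for illustration:

```shell
# Hypothetical helper: print the number of phased and unphased genotype
# calls in an uncompressed VCF. Haploid calls (no separator) are not counted.
count_phased() {
    awk -F'\t' '
        /^#/ { next }                      # skip meta and header lines
        {
            for (i = 10; i <= NF; i++) {   # sample columns start at column 10
                split($i, f, ":")          # GT is the first FORMAT subfield
                if (f[1] ~ /\|/)      phased++
                else if (f[1] ~ /\//) unphased++
            }
        }
        END { printf "phased=%d unphased=%d\n", phased, unphased }
    ' "$1"
}
```

Running count_phased chr22.vcf prints a single summary line. If the phased count is zero, nothing is lost by going through PLINK.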


It seems a little bit complex. But thanks.

written 9 months ago by qwzhang0601