Question: compare and merge VCF files
qwzhang0601 (United States) wrote, 16 months ago:

We have SNP array data and whole-exome sequencing SNP calls for the same group of samples.

Genotypes from both techniques are now in VCF format. The samples are the same, but the SNP lists differ, with only some SNPs present in both data sets.

We want to merge the array and sequencing VCF files, and we wonder whether any tool (e.g., vcftools) can do this. In our case about 26k SNPs were genotyped by both the array and the sequencing data. For those overlapping SNPs, some genotypes were presumably called differently by the two techniques for certain SNPs and individuals, so I am also concerned about how to handle inconsistent genotype calls when merging the two VCF files.

Thanks.

genotype vcf • 783 views

VCFtools has vcf-compare and vcf-merge. BCFtools has bcftools stats and bcftools merge. Both should do what you want.
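
Before merging, it may help to see how often the two call sets actually disagree at the shared sites. The following is a minimal, tool-independent sketch using only awk, not a replacement for vcf-compare or bcftools stats. The function name `gt_concordance` and the file names in the usage note are hypothetical. It assumes uncompressed VCFs, the same sample columns in the same order in both files, GT as the first FORMAT subfield, and sites matched on CHROM, POS, REF and ALT.

```shell
#!/bin/sh
# Hypothetical helper: count concordant, discordant and missing genotype
# calls at sites present in both VCFs (first file is read into memory).
gt_concordance() {
    awk -F'\t' '
        # Normalise a GT string: ignore phase, sort diploid alleles,
        # and collapse any call with a missing allele to ".".
        function norm(g,    a, n, t) {
            gsub(/[|]/, "/", g)
            if (g ~ /[.]/) return "."
            n = split(g, a, "/")
            if (n == 2 && a[1] > a[2]) { t = a[1]; a[1] = a[2]; a[2] = t }
            return (n == 2) ? a[1] "/" a[2] : g
        }
        /^#/ { next }                          # skip header lines
        { key = $1 ":" $2 ":" $4 ":" $5 }      # CHROM:POS:REF:ALT
        NR == FNR {                            # first file: store genotypes
            seen[key] = 1
            for (i = 10; i <= NF; i++) {
                split($i, f, ":")
                gt[key, i] = norm(f[1])
            }
            next
        }
        key in seen {                          # second file: compare shared sites
            shared++
            for (i = 10; i <= NF; i++) {
                split($i, f, ":")
                g = norm(f[1])
                if (g == "." || gt[key, i] == ".") missing++
                else if (g == gt[key, i]) same++
                else diff++
            }
        }
        END {
            printf "shared_sites=%d concordant=%d discordant=%d missing=%d\n",
                   shared, same, diff, missing
        }
    ' "$1" "$2"
}
```

Usage would look like `gt_concordance array.vcf exome.vcf`. A high discordant count at shared sites often points to strand or allele-coding differences rather than true genotyping errors, so it is worth checking that before deciding which call set to trust.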

Reply written 16 months ago by h.mon

Thanks! I will take a look.

Reply written 16 months ago by qwzhang0601
Shicheng Guo wrote, 16 months ago:

Suppose you have chr22.chip and chr22.imputation to be merged. You can try the following:

# Remove duplicated variants from the chip data (IDs are in column 4 of plink.dupvar; skip its header line)
plink --bfile chr22.chip --list-duplicate-vars
awk 'NR > 1 {print $4}' plink.dupvar > plink.dupvar.id
plink --bfile chr22.chip --exclude plink.dupvar.id --make-bed --out chr22.chip.rmdup
# Do the same for the imputed data
plink --bfile chr22.imputation --list-duplicate-vars
awk 'NR > 1 {print $4}' plink.dupvar > plink.dupvar.id
plink --bfile chr22.imputation --exclude plink.dupvar.id --make-bed --out chr22.imputation.rmdup
# First merge attempt; variants with mismatched alleles are listed in merge-merge.missnp
plink --bfile chr22.imputation.rmdup --bmerge chr22.chip.rmdup --make-bed --out merge
# Flip the strand of those variants in the chip data and try again
plink --bfile chr22.chip.rmdup --flip merge-merge.missnp --make-bed --out chr22.chip.rmdup.flip
plink --bfile chr22.imputation.rmdup --bmerge chr22.chip.rmdup.flip --make-bed --out merge
# Variants still listed in merge-merge.missnp are not fixed by flipping; exclude them from both data sets
plink --bfile chr22.imputation.rmdup --exclude merge-merge.missnp --make-bed --out chr22.imputation.rmdup.rm3
plink --bfile chr22.chip.rmdup.flip --exclude merge-merge.missnp --make-bed --out chr22.chip.rmdup.flip.rm3
# Final merge, followed by pairwise IBD estimates as a sanity check
plink --bfile chr22.imputation.rmdup.rm3 --bmerge chr22.chip.rmdup.flip.rm3 --make-bed --out merge
plink --bfile merge --genome --out merge.ibd

By the way, PLINK discards phase information, so be careful if you need to keep the genotypes phased.
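
The commands above work on PLINK binary filesets, whereas the question starts from VCF files. A minimal sketch of the round trip follows, assuming PLINK 1.9 is installed as `plink`. The helper names `run`, `vcf_to_bfile` and `bfile_to_vcf`, the `DRY_RUN` switch, and the file names are all hypothetical and only for illustration.

```shell
#!/bin/sh
# Sketch only. Set DRY_RUN=1 to print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Convert a VCF to a PLINK binary fileset (.bed/.bim/.fam).
# --double-id uses each VCF sample ID as both family ID and individual ID.
vcf_to_bfile() {
    run plink --vcf "$1" --double-id --make-bed --out "$2"
}

# Export a binary fileset back to VCF. vcf-iid writes only the individual ID
# as the sample name. Phase is not preserved by this round trip.
bfile_to_vcf() {
    run plink --bfile "$1" --recode vcf-iid --out "$2"
}

# Example calls (hypothetical file names):
#   vcf_to_bfile array.chr22.vcf.gz chr22.chip
#   bfile_to_vcf merge merge.final
```

With `--double-id` on both inputs, the same sample should get the same family and individual ID in both filesets, which is what `--bmerge` needs in order to treat them as one individual.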


It seems a little bit complex. But thanks.

Reply written 16 months ago by qwzhang0601