Combine/Merge Two Phased Vcf Files For Plink Analysis
10.8 years ago
michealsmith ▴ 790

I have two already-phased VCF files (one patient and one control) and would like to merge them.

I tried vcf-phased-joint, but it requires the same columns, i.e. the number of individuals must be equal in both files, which seems odd. Then I tried GATK -T CombineVariants, and it works!

But my questions are:

  1. Is it OK to simply combine/merge two PHASED VCFs? (The optimal approach, in my mind, is to combine the patient and control BAMs, call variants together, and phase all SNPs together using GATK ReadBackedPhasing, but processing these BAM files would be too painful. The controls here are actually 1000 Genomes data.) After merging, many genotype fields will be missing; is this OK for downstream plink analysis?

  2. How does plink handle missing genotypes, as well as unphased genotypes?

  3. Should I use only SNPs for plink, or is it OK to include indels as well?
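
To make the three questions concrete, here is a minimal Python sketch of what a site-level merge does to the genotype columns. It is not how CombineVariants or plink is implemented; the function names (`merge_sites`, `unphase`, `is_snp`) and the in-memory layout (a dict keyed by chrom, pos, ref, alt) are assumptions made for illustration. Sites present in only one file get `./.` for every sample from the other file, phase is dropped on the assumption that the downstream format stores genotypes without phase, and a simple length check separates SNPs from indels.

```python
MISSING = "./."

def merge_sites(cases, controls, n_cases, n_controls):
    """Union two {site: [GT, ...]} maps, padding absent cohorts with './.'.

    A site key is a (chrom, pos, ref, alt) tuple (hypothetical layout).
    """
    merged = {}
    for site in sorted(set(cases) | set(controls)):
        case_gts = cases.get(site, [MISSING] * n_cases)
        control_gts = controls.get(site, [MISSING] * n_controls)
        merged[site] = case_gts + control_gts
    return merged

def unphase(gt):
    """Drop phase from a genotype string: '1|0' becomes '0/1'."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return MISSING  # any missing allele makes the whole call missing
    return "/".join(sorted(alleles, key=int))

def is_snp(ref, alt):
    """True for a single-base substitution, False for indels."""
    return len(ref) == 1 and len(alt) == 1 and ref != alt

# Toy data: two patient samples, one control sample.
cases = {
    ("1", 100, "A", "G"): ["0|1", "1|1"],
    ("1", 200, "C", "CT"): ["0|0", "1|0"],
}
controls = {
    ("1", 100, "A", "G"): ["0|0"],
    ("1", 300, "G", "T"): ["1|0"],
}
merged = merge_sites(cases, controls, n_cases=2, n_controls=1)
for (chrom, pos, ref, alt), gts in merged.items():
    kind = "SNP" if is_snp(ref, alt) else "indel"
    print(chrom, pos, ref, alt, kind, [unphase(gt) for gt in gts])
```

In the toy output, the site at position 200 is missing for the control and the site at position 300 is missing for both patients, which is exactly the pattern the merged file will show wherever the two call sets disagree on which sites exist.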

I'm a plink beginner, so I'm confused. Many thanks!

10.8 years ago
venks ▴ 740

Hi,

I have done something like this before for plink analysis. We have always used multi-sample VCF files for plink analysis. It is better to call the variants for both samples together so that you do not introduce bias. This should output a multi-sample VCF file, which you can use for further plink analysis.
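
One way to see the bias that separate calling can introduce is to compare call rates between the two cohorts after a merge. The sketch below is only an illustration under stated assumptions: the helper names (`call_rate`, `flag_cohort_specific_sites`), the `max_gap` threshold of 0.1, and the dict layout are all hypothetical, and in practice plink's own missingness reports would be the tool to use.

```python
def call_rate(genotypes):
    """Fraction of genotype strings with no missing allele."""
    if not genotypes:
        return 0.0
    called = sum(1 for gt in genotypes if "." not in gt)
    return called / len(genotypes)

def flag_cohort_specific_sites(merged, n_cases, max_gap=0.1):
    """Return sites whose call rate differs between cases and controls.

    `merged` maps a site key to a genotype list with the first `n_cases`
    entries belonging to cases and the rest to controls (assumed layout).
    """
    flagged = []
    for site, gts in sorted(merged.items()):
        gap = abs(call_rate(gts[:n_cases]) - call_rate(gts[n_cases:]))
        if gap > max_gap:
            flagged.append(site)
    return flagged

# Toy merged data: two cases followed by one control.
merged_toy = {
    ("1", 100, "A", "G"): ["0|1", "1|1", "0|0"],
    ("1", 300, "G", "T"): ["./.", "./.", "1|0"],
}
flagged_sites = flag_cohort_specific_sites(merged_toy, n_cases=2)
print(flagged_sites)
```

A site that is called in every control but in no case (or the reverse) carries no usable case/control contrast, so flagging such sites before association testing is a reasonable sanity check on a merged file.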

Good luck
