Question: Can't find the "classic" error in merged vcf files with vcf-merge.
3.8 years ago, msimmer92 wrote:

Hello everyone! Let me provide a bit of context first. I'm performing a principal component analysis (PCA) on a sample made up of two groups (groups 1 and 2). My data are two separate VCF files, one per group. To run the PCA in Plink, I needed to generate a single VCF file containing the individuals from both groups. For merging, I used the vcf-merge command from vcftools, which seemed to run correctly.

The problem: after merging both files, running the PCA and plotting the result in R, I noticed the graph was odd (you can see it below). A labmate told me, "Ohh, that's a classic merging error, I've seen it before... but I don't remember much right now. See if you did something wrong in the merging." I'm new to bioinformatics, so I have looked again and again but I can't find the error. The commands ran smoothly at each step, and I don't yet have enough knowledge to spot the mistake. Given what my labmate said, I decided to post the question here, since it seems to be a "classic rookie mistake".

Here is every step of the process, followed by the final graph, in case you can spot the problem.

./bgzip group1.vcf
./tabix group1.vcf.gz
./bgzip group2.vcf
./tabix group2.vcf.gz

vcf-merge group1.vcf.gz group2.vcf.gz | bgzip -c > bothmerge.vcf.gz

./plink --vcf bothmerge.vcf.gz --pca --out bothmergepca

(Then, loading the bothmergepca.eigenvec file in R, I plotted the first principal component against the second one).
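
One check that may be worth running on the two inputs before merging is how many variant sites they share. By default, vcf-merge writes missing genotypes for samples whose input file lacks a site, so a large number of group-specific sites can influence the PCA. Whether that is what happened here is not confirmed; the sketch below only counts sites and diagnoses nothing by itself. The function names `vcf_sites` and `compare_sites` are hypothetical helpers, and the script assumes bash plus the standard `gzip`, `grep`, `cut`, `sort` and `comm` tools (bgzip output is gzip-compatible, so `gzip -dc` can read it).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing the variant sites of two compressed VCFs.

# Print one "CHROM:POS" key per variant record, sorted and de-duplicated.
vcf_sites() {
    gzip -dc "$1" | grep -v '^#' | cut -f1,2 | tr '\t' ':' | LC_ALL=C sort -u
}

# Report how many sites are unique to each file and how many are shared.
compare_sites() {
    local a b
    a=$(mktemp)
    b=$(mktemp)
    vcf_sites "$1" > "$a"
    vcf_sites "$2" > "$b"
    # comm -23: lines only in the first file; -13: only in the second; -12: in both.
    echo "only_in_first=$(LC_ALL=C comm -23 "$a" "$b" | wc -l | tr -d ' ')"
    echo "only_in_second=$(LC_ALL=C comm -13 "$a" "$b" | wc -l | tr -d ' ')"
    echo "shared=$(LC_ALL=C comm -12 "$a" "$b" | wc -l | tr -d ' ')"
    rm -f "$a" "$b"
}

# Example call (file names taken from the commands above):
#   compare_sites group1.vcf.gz group2.vcf.gz
```

If most sites turn out to be shared, this particular explanation becomes less likely and the cause probably lies elsewhere.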

The expected graph was something like a cloud of 2000 dots. Note: I have done PCA and plotted the results in R before, so I'm more familiar with those steps and am fairly certain the mistake is not there.

You can see the graph here:

I hope someone can help me, or at least point me in the right direction. Thank you for your time!
