Question: Comparing VCF files between two groups (15 vcf files against 15 vcf files)
2.2 years ago, Pin.Bioinf wrote:


I have 15 VCF files for one population and 15 VCF files for another. I want to find the differences and the similarities between the two groups: which variants change from one group to the other, which stay the same, and, if possible, a significance score for the differences.

I have read about PLINK, but I am not sure what the pipeline should look like. Which steps should I follow? I read the documentation and it is not clear to me.

I also read about bcftools isec, which is useful for intersecting multiple VCF files. Could I merge the 15 VCF files of each group, ending up with two files, population1_variants.vcf and population2_variants.vcf, and then compare those two against each other to check for differences and similarities?
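For illustration, the comparison described above can be sketched in plain Python. This is only a minimal sketch, not how bcftools works internally: it assumes uncompressed, tab-separated VCF text, identifies a variant by (CHROM, POS, REF, ALT), and looks at site presence only, ignoring genotypes. The function names `variant_keys` and `compare_groups` and the toy records are made up for the example.

```python
def variant_keys(vcf_text):
    """Return the set of (CHROM, POS, REF, ALT) keys found in one VCF's text."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # one key per ALT allele
            keys.add((chrom, int(pos), ref, allele))
    return keys


def compare_groups(group1_vcfs, group2_vcfs):
    """Compare the union of variants seen in each group of VCF texts."""
    g1 = set().union(*(variant_keys(v) for v in group1_vcfs))
    g2 = set().union(*(variant_keys(v) for v in group2_vcfs))
    return {"shared": g1 & g2, "only_group1": g1 - g2, "only_group2": g2 - g1}


# Toy data (hypothetical records, sites only, no sample columns).
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
pop1 = [
    HEADER + "1\t100\t.\tA\tG\t.\t.\t.\n1\t200\t.\tC\tT\t.\t.\t.\n",
    HEADER + "1\t100\t.\tA\tG\t.\t.\t.\n",
]
pop2 = [HEADER + "1\t100\t.\tA\tG\t.\t.\t.\n1\t300\t.\tG\tA,C\t.\t.\t.\n"]

result = compare_groups(pop1, pop2)
for name, variants in result.items():
    print(name, sorted(variants))
```

A presence/absence comparison like this says nothing about how often a variant occurs in each group, so on its own it cannot give a significance score.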

Which approach is better? Is this how people usually analyze variants between populations? How can I assess the significance of the results? Are there any other approaches?

Thank you

Tags: variants, snp, plink, vcf
2.2 years ago, Raony Guimarães (Dublin, Ireland) wrote:

It really depends on what you want to achieve with this comparison. You could merge all the VCFs and run an association analysis with PLINK to find variants that differ between the two groups, or you could run a PCA on all samples to see whether the two populations separate clearly.

Try doing an association analysis:

plink --file mydata --assoc

Then look for SNPs whose allele frequencies differ significantly between the two groups.
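To make the idea of a per-SNP test concrete, here is a minimal pure-Python sketch of an allelic test for a single variant. It is not PLINK's implementation: it uses a two-sided Fisher's exact test on a 2x2 table of allele counts, chosen here as an assumption because counts are small with 15 samples per group. The helper names and the toy genotypes are hypothetical.

```python
from math import comb


def allele_counts(genotypes):
    """Count (ref, alt) alleles in diploid GT strings such as '0/1' or '1|1'."""
    ref = alt = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref += 1
            elif allele != ".":  # '.' is a missing call and is ignored
                alt += 1  # any non-reference allele is counted as alt
    return ref, alt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Toy genotypes for one SNP (hypothetical): 5 samples vs 3 samples.
group1 = ["0/1", "1/1", "0/1", "1/1", "1/1"]
group2 = ["0/0", "0/0", "0/1"]

ref1, alt1 = allele_counts(group1)
ref2, alt2 = allele_counts(group2)
p_value = fisher_exact_two_sided(alt1, ref1, alt2, ref2)
print(f"{p_value:.4f}")  # prints 0.0350
```

When a test like this is repeated over many SNPs, the resulting p-values need a multiple-testing correction (for example Bonferroni) before any of them is called significant.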


Thank you! This seems like a nice approach, and it is what I was looking for. Would the mydata input be the merge of 15samplescase.vcf and 15samplescontrol.vcf? And should those merged VCFs contain only the variants common to all 15 samples in each group?

Thank you

Reply written 2.2 years ago by Pin.Bioinf