3.3 years ago
Luther_Blisset
I want to calculate a pairwise distance matrix (DM) for about 100 samples. Plink's documentation on calculating a DM (https://www.cog-genomics.org/plink/1.9/distance) doesn't seem to say anywhere what the input file has to be.
Am I missing something really obvious here?
I have a concatenated VCF for these samples. Will this suffice?
Thanks
What do you mean by concatenated VCF? If you've called the variants from multiple samples into a single VCF file, i.e. every sample has its own column after the FORMAT column, that should be enough. If not, you need to use bcftools merge to merge your VCF files.
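A quick way to check which case applies is to look at the #CHROM header line and count the columns after the nine fixed ones (CHROM through FORMAT). Below is a minimal Python sketch of that check. The function name sample_names and the tiny in-memory example are made up for illustration, and it assumes an uncompressed, tab-delimited VCF (a .vcf.gz would need gzip.open in text mode instead of a plain file handle).

```python
import io

# The nine fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT.
# Anything after them on the #CHROM line is a sample name.
FIXED_COLUMNS = 9


def sample_names(vcf_lines):
    """Return the sample names listed on the #CHROM header line."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            return line.rstrip("\n").split("\t")[FIXED_COLUMNS:]
        if not line.startswith("#"):
            break  # reached data records without seeing a header line
    raise ValueError("no #CHROM header line found")


# Hypothetical two-sample VCF, kept in memory so the example runs as-is.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB\n"
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\n"
)
names = sample_names(example)
print(len(names), names)
```

If this reports all ~100 samples, the file is already a multi-sample VCF and can be passed to Plink 1.9 directly with --vcf, for example: plink --vcf merged.vcf --distance square (merged.vcf is a placeholder name, and depending on how the sample IDs are written you may also need --double-id or --const-fid). If it reports a single sample, the per-sample files need to go through bcftools merge first.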