Hi, I have a variant file subset_filtered.vcf.gz and want to calculate the kinship values from the file. Could anyone please guide me on calculating the kinship matrix to generate the heatmap from this?
You can use the vcftools relatedness2 module, which implements the method described in this paper: https://academic.oup.com/bioinformatics/article/26/22/2867/228512
vcftools --gzvcf subset_filtered.vcf.gz --relatedness2
You can then use multiqc to create the heatmap. -ip produces interactive graphs, -s writes full sample names, -f forces overwrite.
multiqc -ip -f -s ./
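If you would rather build the matrix yourself, here is a minimal Python sketch that pivots the pairwise table into a square matrix. It assumes the vcftools output (out.relatedness2 by default) is tab-separated with INDV1, INDV2 and RELATEDNESS_PHI columns; the function name and the small inline table are made up for illustration, so check the header of your own file first.

```python
import csv
import io


def relatedness2_to_matrix(handle):
    """Pivot a pairwise relatedness table into a square kinship matrix.

    Assumes a tab-separated table with INDV1, INDV2 and RELATEDNESS_PHI
    columns (the layout expected from `vcftools --relatedness2`).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    index = {}  # sample name -> row/column position, in order of first appearance
    phi = {}    # (sample_a, sample_b) -> kinship coefficient
    for row in reader:
        a, b = row["INDV1"], row["INDV2"]
        index.setdefault(a, len(index))
        index.setdefault(b, len(index))
        value = float(row["RELATEDNESS_PHI"])
        # Store both orientations so the matrix is symmetric even if the
        # file lists each pair only once.
        phi[(a, b)] = value
        phi[(b, a)] = value
    samples = list(index)
    # Pairs missing from the table are left as NaN.
    matrix = [[phi.get((a, b), float("nan")) for b in samples] for a in samples]
    return samples, matrix


# Tiny made-up table standing in for out.relatedness2.
demo = io.StringIO(
    "INDV1\tINDV2\tN_AaAa\tN_AAaa\tN1_Aa\tN2_Aa\tRELATEDNESS_PHI\n"
    "S1\tS1\t100\t0\t100\t100\t0.5\n"
    "S1\tS2\t50\t0\t100\t100\t0.25\n"
    "S2\tS1\t50\t0\t100\t100\t0.25\n"
    "S2\tS2\t100\t0\t100\t100\t0.5\n"
)
samples, matrix = relatedness2_to_matrix(demo)
for name, row in zip(samples, matrix):
    print(name, row)
```

For a real run, replace the demo table with open("out.relatedness2"). The resulting list of lists can then be handed to a plotting library, for example matplotlib's imshow or seaborn's heatmap, with samples as the axis labels.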
The last vcftools release was in 2018. It has been superseded by bcftools for most purposes, and by plink 1.9/2.0 for most analytical functions which didn't get included in bcftools.
In this case,
plink2 --vcf subset_filtered.vcf.gz --make-king-table
is far more efficient than vcftools. I just tested this on a Mac on 10% of 1000 Genomes phase 3 chr21, and it took 1.1 sec using plink2 and more than 22 minutes using vcftools, even though plink2 was forced to waste time converting the VCF to its native file format.
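For the heatmap, the pairwise table that plink2 writes (plink2.kin0 by default) still has to be turned into a square matrix. Below is a rough Python sketch of that step. It assumes a tab-separated file whose header starts with "#" and contains IID1, IID2 and KINSHIP columns, and that each pair is listed only once with no self-pairs. Filling the diagonal with 0.5 is my own choice (the value KING-style kinship takes for identical genomes), and the function name and demo rows are hypothetical.

```python
import csv
import io


def kin0_to_matrix(handle, self_kinship=0.5):
    """Build a symmetric kinship matrix from a pairwise .kin0-style table.

    Assumes a tab-separated table whose header line starts with '#' and
    includes IID1, IID2 and KINSHIP columns. The diagonal is filled with
    `self_kinship`, which is an assumption, not something read from the file.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = [name.lstrip("#") for name in next(reader)]
    col = {name: i for i, name in enumerate(header)}
    index = {}  # sample ID -> matrix position, in order of first appearance
    pairs = []
    for fields in reader:
        a, b = fields[col["IID1"]], fields[col["IID2"]]
        index.setdefault(a, len(index))
        index.setdefault(b, len(index))
        pairs.append((a, b, float(fields[col["KINSHIP"]])))
    n = len(index)
    # Start from NaN so that any pair absent from the table is visible.
    matrix = [[float("nan")] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = self_kinship
    for a, b, kinship in pairs:
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = kinship
    return list(index), matrix


# Made-up rows standing in for plink2.kin0.
demo = io.StringIO(
    "#IID1\tIID2\tNSNP\tHETHET\tIBS0\tKINSHIP\n"
    "S2\tS1\t1000\t0.10\t0.02\t0.24\n"
    "S3\tS1\t1000\t0.08\t0.05\t-0.01\n"
    "S3\tS2\t1000\t0.09\t0.03\t0.12\n"
)
ids, kin = kin0_to_matrix(demo)
for name, row in zip(ids, kin):
    print(name, row)
```

Swap the demo table for open("plink2.kin0") on real data. Sample order follows first appearance in the file, so reorder ids and kin together if you want the heatmap axes sorted or clustered.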