Question: R or Python code for pairwise genetic identity from a VCF file
Hi everyone,

I'd appreciate some help from someone familiar with R or Python. I have very minimal programming skills and am trying to write a simple piece of code that does the following:

I have a VCF with missing data. I want a loop to go down the SNP vector for each pair of individuals and, at every locus where neither individual has missing data, report whether their alleles are identical or not. The output would include two numbers for each pair of individuals: 1) At how many loci do both individuals have non-missing data? 2) Of those, at how many loci do the individuals have identical alleles?

Any help appreciated!

Take a look at cyvcf2 for parsing VCF files; it can also do what you want.
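If you'd rather start without installing anything, here is a minimal sketch in plain Python (standard library only) of the counting you describe. It is not how cyvcf2 does it, just one way to write the loop by hand. The function names parse_gt, pairwise_identity and open_vcf are made up for illustration. It assumes GT is the first FORMAT field (as the VCF spec requires when GT is present), treats a genotype as missing if any of its alleles is ".", and counts two genotypes as identical when they carry the same alleles regardless of order or phasing (so 0/1 equals 1|0). Change that comparison if you mean something different by "identical".

```python
import gzip
import re
from itertools import combinations


def parse_gt(sample_field):
    """Return the genotype as a sorted tuple of allele codes, or None if missing.

    Assumption: GT is the first colon-separated subfield of the sample column.
    """
    gt = sample_field.split(":", 1)[0]
    alleles = re.split(r"[/|]", gt)
    if any(a in (".", "") for a in alleles):
        return None  # any missing allele makes the whole call missing
    return tuple(sorted(alleles))  # sorting ignores phase/order: 0/1 == 1|0


def pairwise_identity(lines):
    """Map (sample_a, sample_b) -> [loci with both called, loci with identical GT]."""
    samples = []
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at the 10th column
            counts = {pair: [0, 0] for pair in combinations(samples, 2)}
            continue
        gts = [parse_gt(f) for f in fields[9:]]
        for (i, a), (j, b) in combinations(enumerate(samples), 2):
            if gts[i] is None or gts[j] is None:
                continue  # at least one individual is missing at this locus
            counts[(a, b)][0] += 1
            if gts[i] == gts[j]:
                counts[(a, b)][1] += 1
    return counts


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)
```

To use it on a file (the path below is a placeholder), something like this prints one line per pair:

```python
with open_vcf("my_data.vcf.gz") as handle:
    for (a, b), (shared, identical) in pairwise_identity(handle).items():
        print(a, b, shared, identical, sep="\t")
```

With n individuals this does n*(n-1)/2 comparisons per locus, which is fine for small data sets; for thousands of samples you would want a vectorised approach on a genotype matrix instead.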

