Question: R or Python code for pairwise genetic identity from a VCF file
9 months ago, Stephanie wrote:

Hi everyone,

I'd appreciate some help from someone familiar with R or Python. I have very minimal programming skills and am trying to write a simple piece of code that does the following:

I have a VCF with missing data. For each pair of individuals, I want a loop to go down the SNP vector and, at every locus where both individuals have non-missing data, report whether their alleles are identical. The output would include two numbers for each pair of individuals: 1) At how many loci do both individuals have non-missing data? 2) Of those, at how many loci do the individuals have identical alleles?

Any help appreciated!

python snp next-gen R vcf • 396 views

Take a look at cyvcf2 for parsing VCF files; it can also do what you want.
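
For anyone who wants to see the counting logic spelled out, here is a minimal, library-free sketch of what the question describes. It does not use cyvcf2 and is not its API: it reads plain VCF text lines directly. The function names `parse_gt` and `pairwise_identity` are made up for illustration, and the sketch assumes every record has a FORMAT column with GT as its first subfield, that "identical" means the same unordered set of alleles (so 0/1 and 1|0 count as equal), and that a genotype with any `.` allele counts as missing. The tiny three-sample VCF at the bottom is invented test data.

```python
from itertools import combinations

def parse_gt(field):
    """Return the genotype as a sorted tuple of allele codes, or None if missing."""
    gt = field.split(":")[0]                 # assumes GT is the first subfield
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles or alleles == [""]:
        return None                          # any missing allele -> treat as missing
    return tuple(sorted(alleles))            # sorted so phase/order is ignored

def pairwise_identity(vcf_lines):
    """Map each sample pair to [loci with data in both, loci with identical genotypes]."""
    samples = []
    counts = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                         # skip blank lines and meta-information
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]               # sample names follow the 9 fixed columns
            counts = {pair: [0, 0] for pair in combinations(samples, 2)}
            continue
        gts = [parse_gt(f) for f in cols[9:]]
        for (i, a), (j, b) in combinations(enumerate(samples), 2):
            if gts[i] is None or gts[j] is None:
                continue                     # at least one of the pair is missing here
            counts[(a, b)][0] += 1
            if gts[i] == gts[j]:
                counts[(a, b)][1] += 1
    return counts

# Invented example data: three samples, three sites.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB\tC",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/0\t./.",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/1\t1/0\t1/1",
    "1\t300\t.\tG\tA\t.\tPASS\t.\tGT\t./.\t0/1\t0/1",
]
result = pairwise_identity(demo)
for (a, b), (n_both, n_same) in result.items():
    print(a, b, n_both, n_same)
```

For a real file you could pass an open file handle instead of the list, e.g. `pairwise_identity(open("my.vcf"))` for an uncompressed VCF (the filename is a placeholder). The loop over all pairs at every site is quadratic in the number of samples, so for large cohorts a parser such as cyvcf2 plus array operations would likely be faster.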

written 9 months ago by WouterDeCoster