Question: R or Python code for pairwise genetic identity from a VCF file
Stephanie30 wrote:

Hi everyone,

I'd appreciate some help from someone familiar with R or Python. I have very minimal programming skills and am trying to write a simple piece of code that does the following:

I have a VCF with missing data. I want a loop that goes down the SNP vector for each pair of individuals and, at every locus where neither individual has missing data, reports whether their alleles are identical. The output would be two numbers for each pair of individuals: 1) the number of loci at which both individuals have non-missing data, and 2) of those, the number of loci at which the two individuals have identical alleles.

Any help appreciated!

Tags: python, snp, next-gen, R, vcf
modified 9 months ago by Biostar • written 9 months ago by Stephanie

Take a look at cyvcf2 for parsing VCF files; it can also do what you want.

written 9 months ago by WouterDeCoster
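
For anyone who wants to see the counting logic from the question spelled out, here is a minimal sketch in plain Python. It does not use cyvcf2; it reads the VCF as tab-separated text instead, so it assumes an uncompressed file with a `#CHROM` header line and a `GT` entry in the FORMAT column. The function names `parse_gt` and `pairwise_identity` are made up for this example. It also assumes that "identical alleles" means the same unordered genotype, so `0/1`, `1/0` and `0|1` all count as identical, and that a genotype with any `.` allele counts as missing. Adjust those rules if your definition differs.

```python
from itertools import combinations


def parse_gt(sample_field, gt_index):
    """Return a sorted tuple of allele codes, or None if the genotype is missing."""
    parts = sample_field.split(":")
    if gt_index >= len(parts):
        return None
    # Treat phased (|) and unphased (/) genotypes the same way (assumption).
    alleles = parts[gt_index].replace("|", "/").split("/")
    if any(a in (".", "") for a in alleles):
        return None
    return tuple(sorted(alleles))


def pairwise_identity(vcf_lines):
    """Map each sample pair to [loci with both genotypes called, loci with identical genotypes]."""
    samples = []
    counts = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at the tenth column
            counts = {pair: [0, 0] for pair in combinations(samples, 2)}
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue  # no genotype information at this record
        gt_index = fmt.index("GT")
        gts = [parse_gt(f, gt_index) for f in fields[9:]]
        for (i, a), (j, b) in combinations(enumerate(samples), 2):
            if gts[i] is None or gts[j] is None:
                continue  # at least one individual is missing here
            counts[(a, b)][0] += 1
            if gts[i] == gts[j]:
                counts[(a, b)][1] += 1
    return counts
```

To use it, pass any iterable of VCF lines, for example an open file handle, and loop over the returned dictionary to print the two numbers for each pair. The loop over pairs runs once per locus, so the running time grows with the square of the number of samples; for very large cohorts a dedicated parser or a vectorised approach would probably be a better fit.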