R or Python code for pairwise genetic identity from a VCF file
5.6 years ago
Stephanie ▴ 40

Hi everyone,

I'd appreciate some help from someone familiar with R or Python. I have very minimal programming skills, and I'm trying to write a simple piece of code that does the following:

I have a VCF with missing data. For each pair of individuals, I want a loop to go down their SNP vectors and, at every locus where neither individual has missing data, report whether the alleles are identical or not. The output would include two numbers for each pair of individuals: 1) At how many loci do both individuals have non-missing data? 2) Of those, at how many loci do the individuals have identical alleles?
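A minimal Python sketch of the counting described above, using only the standard library. It rests on several assumptions that are not stated in the question: the VCF is uncompressed and tab-delimited, `GT` is the first FORMAT subfield, a genotype counts as missing if any of its alleles is `.`, and "identical alleles" means the same unordered set of alleles (so `0/1` and `1|0` match). The names `parse_gt` and `pairwise_identity` are made up for illustration.

```python
from itertools import combinations


def parse_gt(sample_field):
    """Return the sorted allele tuple for one sample, or None if any allele is missing."""
    gt = sample_field.split(":")[0]            # assumption: GT is the first FORMAT subfield
    alleles = gt.replace("|", "/").split("/")  # treat phased and unphased calls alike
    if "." in alleles:
        return None
    return tuple(sorted(alleles))


def pairwise_identity(vcf_lines):
    """Map each sample pair to [loci where both are called, loci where genotypes match]."""
    samples, counts = [], {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                           # skip blank lines and meta-information
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]               # sample names start at column 10
            counts = {pair: [0, 0] for pair in combinations(samples, 2)}
            continue
        calls = dict(zip(samples, map(parse_gt, fields[9:])))
        for (a, b), tally in counts.items():
            if calls[a] is not None and calls[b] is not None:
                tally[0] += 1
                tally[1] += int(calls[a] == calls[b])
    return counts


# Tiny made-up example with three samples and three loci.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB\tC",
    "1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/0\t0/0\t./.",
    "1\t200\t.\tC\tT\t.\t.\t.\tGT\t0/1\t1|0\t1/1",
    "1\t300\t.\tG\tA\t.\t.\t.\tGT\t1/1\t0/1\t0/1",
]
result = pairwise_identity(demo)
for pair, (shared, identical) in result.items():
    print(pair, shared, identical)
```

For a real file, something like `with open("my.vcf") as fh: pairwise_identity(fh)` should work, where `my.vcf` is a placeholder path. The loop visits every pair at every locus, so it may be slow for many samples.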

Any help appreciated!

R snp next-gen python vcf • 1.2k views

Take a look at cyvcf2, a Python library for parsing VCF files; you can also use it to do what you want.
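A rough sketch of how this might look with cyvcf2, offered as one possible approach rather than the library's recommended recipe. It assumes `variant.genotypes` yields one entry per sample of the form `[allele, allele, phased]` with `-1` marking a missing allele, and that two genotypes are "identical" when they carry the same unordered alleles. The helper names `genotype_key`, `tally`, and `tally_vcf` are hypothetical.

```python
from itertools import combinations


def genotype_key(call):
    """Turn one genotype entry, e.g. [0, 1, True], into a sorted allele tuple.

    Returns None when any allele is missing (assumed to be encoded as -1).
    """
    alleles = call[:-1]                  # the last element is the phased flag
    if any(a < 0 for a in alleles):
        return None
    return tuple(sorted(alleles))


def tally(samples, genotype_rows):
    """Return {(sample_a, sample_b): (loci both called, loci identical)}."""
    counts = {pair: [0, 0] for pair in combinations(range(len(samples)), 2)}
    for row in genotype_rows:            # one row of per-sample genotypes per locus
        keys = [genotype_key(call) for call in row]
        for (i, j), c in counts.items():
            if keys[i] is not None and keys[j] is not None:
                c[0] += 1
                c[1] += int(keys[i] == keys[j])
    return {(samples[i], samples[j]): tuple(c) for (i, j), c in counts.items()}


def tally_vcf(path):
    """Run tally() over every variant in a VCF file read with cyvcf2."""
    from cyvcf2 import VCF               # imported here so tally() runs without cyvcf2
    vcf = VCF(path)
    return tally(vcf.samples, (variant.genotypes for variant in vcf))


# Made-up genotype rows in the assumed cyvcf2 layout, for three samples.
fake_rows = [
    [[0, 0, False], [0, 0, False], [-1, -1, False]],
    [[0, 1, False], [1, 0, True], [1, 1, False]],
]
fake_result = tally(["A", "B", "C"], fake_rows)
print(fake_result)
```

With cyvcf2 installed, `tally_vcf("my.vcf.gz")` (a placeholder path) would give the two numbers per pair that the question asks for.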
