Closed: VCF file analysis
3.7 years ago
Nyksubuz ▴ 10

I have a VCF file and a few questions to answer about it. Can someone help me work through them?

The questions are as follows:

  1. How many variant records does the file contain?
  2. How many genotype calls are there per variant record?
  3. Are the genotype calls phased or unphased?
  4. Write code or pseudocode (in any language of your choosing) to calculate allele frequencies for each variant in the file.
  5. Design a relational database schema to store the following information: the variant ID, the chromosomal location of the variant, and the alleles with their corresponding frequencies.
  6. Write code or pseudocode to populate your database schema from the VCF file.
  7. How might you store the genotypes such that they could be retrieved quickly, for a project that has produced genotypes for ~1200 individuals across ~80 million sites?
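
For questions 1 to 6, a minimal sketch in Python might look like the following. It is only an illustration under stated assumptions: the file is an uncompressed, tab-delimited VCF; GT is the first FORMAT key; frequencies are computed from called alleles only, with missing calls (`.`) skipped. The tiny `EXAMPLE_VCF`, the function names, and the two-table layout (`variant`, `allele`) are hypothetical and not taken from any particular tool.

```python
import sqlite3
from collections import Counter

# Tiny made-up VCF used only for illustration; real data would be read from a file.
EXAMPLE_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3
1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0|0\t0|1\t1|1
1\t200\trs2\tC\tT,G\t.\tPASS\t.\tGT\t0/1\t2/2\t./.
"""

# Hypothetical schema: one row per variant, one row per allele of that variant.
SCHEMA = """
CREATE TABLE variant (
    variant_pk INTEGER PRIMARY KEY,
    variant_id TEXT,
    chrom      TEXT NOT NULL,
    pos        INTEGER NOT NULL
);
CREATE TABLE allele (
    variant_pk   INTEGER NOT NULL REFERENCES variant(variant_pk),
    allele_index INTEGER NOT NULL,  -- 0 = REF, 1.. = ALT alleles in file order
    sequence     TEXT NOT NULL,
    frequency    REAL NOT NULL,
    PRIMARY KEY (variant_pk, allele_index)
);
"""


def parse_vcf(lines):
    """Yield (chrom, pos, variant_id, alleles, genotypes) per variant record."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the column header
        fields = line.split("\t")
        chrom, pos, variant_id, ref, alt = fields[:5]
        alleles = [ref] + [a for a in alt.split(",") if a != "."]
        # Assumption: GT is the first colon-separated key of each sample column.
        genotypes = [sample.split(":")[0] for sample in fields[9:]]
        yield chrom, int(pos), variant_id, alleles, genotypes


def phasing(genotypes):
    """Classify one record's calls as phased, unphased, mixed, or unknown."""
    seps = {"|" if "|" in gt else "/" for gt in genotypes if "|" in gt or "/" in gt}
    if not seps:
        return "unknown"
    if len(seps) == 2:
        return "mixed"
    return "phased" if seps == {"|"} else "unphased"


def allele_frequencies(alleles, genotypes):
    """Return [(allele, frequency), ...] computed over called alleles only."""
    counts = Counter()
    for gt in genotypes:
        for idx in gt.replace("|", "/").split("/"):
            if idx != ".":  # skip missing calls
                counts[int(idx)] += 1
    total = sum(counts.values())
    return [(a, counts[i] / total if total else 0.0) for i, a in enumerate(alleles)]


def load(conn, lines):
    """Create the schema and insert every variant with its allele frequencies."""
    conn.executescript(SCHEMA)
    for chrom, pos, variant_id, alleles, genotypes in parse_vcf(lines):
        cur = conn.execute(
            "INSERT INTO variant (variant_id, chrom, pos) VALUES (?, ?, ?)",
            (variant_id, chrom, pos),
        )
        rows = [
            (cur.lastrowid, i, allele, freq)
            for i, (allele, freq) in enumerate(allele_frequencies(alleles, genotypes))
        ]
        conn.executemany("INSERT INTO allele VALUES (?, ?, ?, ?)", rows)
    conn.commit()
```

With this sketch, the number of variant records is `len(list(parse_vcf(lines)))`, the number of genotype calls per record is the length of each record's `genotypes` list, and `phasing(genotypes)` reports whether the calls use `|` or `/`. Calling `load(sqlite3.connect(":memory:"), EXAMPLE_VCF.splitlines())` would populate the two tables from the example data. A row-per-genotype table like this is unlikely to scale to question 7, which is a separate design problem.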
vcf variants genome gene sequence • 244 views
This thread is not open. No new answers may be added