Looking for input on writing a tool for analyzing haplotypes from 1KG
6.9 years ago
dnygard • 0

Hey all. To outline my goal: I'm currently working on my honours thesis, and my project involves comparing the haplotypes of two genes to see whether they have an evolutionary link. I am using 1000 Genomes Project phase 3 data, which is already phased, so it seems like it should be straightforward to load into a program and run some haplotype analysis. The closest tool I could find for this was Haploview, but a slew of problems make it a no-go for me. I've basically decided to write a tool in R that takes phased .vcf files, computes some statistics, and produces a file with easy-to-read stats and graphics. Before I start working on the tool, I wanted to get input from anyone working in this area on anything they think would be useful to implement in this package. For now, the features that are important to my project are as follows:

able to take multiple .vcf files as input
batch process capable for use on a computing cluster
ability to tag specific SNPs, or just work on all SNPs above a MAF cutoff
calculation of statistics, possibly FST, Hardy-Weinberg equilibrium, linkage disequilibrium, etc.
maybe output some nice looking visualizations of the data
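
To make the MAF cutoff and linkage disequilibrium items above a bit more concrete, here is a minimal sketch of the arithmetic I have in mind. It is in Python purely for illustration (the tool itself would be in R). It assumes biallelic sites with no missing genotypes, and the function names and toy genotypes are hypothetical placeholders, not real 1KG data or an existing package's API.

```python
from itertools import combinations


def parse_phased_gt(gt_fields):
    """Flatten phased genotype strings such as '0|1' into a list of 0/1 alleles.

    Assumption: biallelic sites, no missing calls ('.'), GT is the first subfield.
    """
    alleles = []
    for field in gt_fields:
        gt = field.split(":")[0]  # keep only the GT subfield
        if "|" not in gt:
            raise ValueError(f"unphased or malformed genotype: {gt!r}")
        alleles.extend(int(a) for a in gt.split("|"))
    return alleles


def minor_allele_freq(alleles):
    """Frequency of the rarer allele among the haplotypes."""
    p = sum(alleles) / len(alleles)
    return min(p, 1 - p)


def filter_by_maf(snps, cutoff):
    """Keep only SNPs whose minor allele frequency is at or above the cutoff."""
    return {name: haps for name, haps in snps.items()
            if minor_allele_freq(haps) >= cutoff}


def r_squared(hap_a, hap_b):
    """Pairwise LD as r^2 = D^2 / (pA(1-pA) pB(1-pB)), with D = pAB - pA*pB."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")


def pairwise_r2(snps):
    """r^2 for every pair of SNPs, keyed by the pair of SNP names."""
    return {(x, y): r_squared(snps[x], snps[y])
            for x, y in combinations(snps, 2)}


# Toy input: four samples (eight haplotypes) at three made-up sites.
raw = {
    "snp_a": ["0|1", "1|1", "0|0", "1|0"],
    "snp_b": ["0|1", "1|1", "0|0", "1|0"],
    "snp_c": ["0|0", "0|0", "0|0", "0|1"],
}
snps = {name: parse_phased_gt(gts) for name, gts in raw.items()}
kept = filter_by_maf(snps, cutoff=0.2)  # the rare snp_c falls below the cutoff
ld = pairwise_r2(kept)
print(ld)
```

Because the data are already phased, the haplotype frequency pAB can be counted directly instead of being estimated (for example with an EM algorithm), which is the main reason phased 1KG input should keep this part simple.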

This could be a very bare-bones program that I use only for my own purposes, but my thesis advisor has alluded to the option of building a more robust package and making that my project altogether. If there's enough interest in something like this, or if you folks have suggestions, I may go that route. Any input is much appreciated.

genome R SNP haplotype 1kg • 1.2k views