Question

From VCF of Unrelated Samples to Haplotypes

0


11.4 years ago

Sean Davis 26k

I have a VCF file from a set of exomes (about 50) and would like to end up with a set of likely haplotypes for a small set of regions (genes). I imagine that, using the LD information from projects like HapMap/1000 Genomes, this should be approximately possible. We could also use read-based haplotype methods, but with exomes, I don't expect this to be a fruitful exercise.

We are ultimately interested in how germline variation might impact drug sensitivity in our samples, so we want to reduce the number of potential variants to a minimum consistent set.

vcf haplotype • 3.0k views


2


I have phased a small population (40 individuals) using Beagle. Do you want to impute missing genotypes? Will you be comparing this with 1kg as a background? The amount of LD will be tricky. Brian Browning's review may help you: http://www.nature.com/nrg/journal/v12/n10/pdf/nrg3054.pdf?WT.ec_id=NRG-201110

11.4 years ago by Zev.Kronenberg 12k

score 0 · Answer 1 · 2012-12-18

0


11.4 years ago

Sean Davis 26k

Zev's pointer to the Browning review was right on target and got me going. GATK has some tools for working with BEAGLE including ProduceBeagleInput and BeagleOutputToVCF. I am most interested in comparing between groups in my own sample set, but comparison to 1kg is something I should try. In any case, I'm moving forward with BEAGLE.
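For anyone following the same route: once the phased calls are back in VCF, pulling per-sample haplotypes for a gene region is mostly string handling. Below is a rough Python sketch, not part of BEAGLE or GATK. It assumes diploid samples, the standard VCF `GT` field with `|` marking phased calls, and it simply skips any site where a sample is unphased or missing. The function names are my own, for illustration only.

```python
from collections import Counter


def haplotypes_in_region(vcf_lines, chrom, start, end):
    """Return {sample: (hap1, hap2)} built from phased sites in chrom:start-end.

    Assumes diploid genotypes; sites with any unphased or missing call are skipped.
    """
    samples = []
    haps = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # blank line or meta-information header
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            haps = {s: ([], []) for s in samples}
            continue
        pos = int(fields[1])
        if fields[0] != chrom or not (start <= pos <= end):
            continue
        alleles = [fields[3]] + fields[4].split(",")  # REF, then ALT alleles
        gt_index = fields[8].split(":").index("GT")
        calls = [f.split(":")[gt_index] for f in fields[9:]]
        if any("|" not in gt or "." in gt for gt in calls):
            continue  # keep only sites phased and called in every sample
        for sample, gt in zip(samples, calls):
            first, second = gt.split("|")
            haps[sample][0].append(alleles[int(first)])
            haps[sample][1].append(alleles[int(second)])
    return {s: ("-".join(h1), "-".join(h2)) for s, (h1, h2) in haps.items()}


def count_haplotypes(haps):
    """Count how often each distinct haplotype string occurs across all samples."""
    return Counter(h for pair in haps.values() for h in pair)
```

Usage would be along the lines of `count_haplotypes(haplotypes_in_region(open("phased.vcf"), "1", 1000, 5000))`, where the file name and coordinates are placeholders. Comparing the resulting counts between groups of samples is then straightforward.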


0


Just a pitfall to avoid: always set lowmem=true. When I left it out, BEAGLE would crash at random.
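If you launch BEAGLE from a script, one way to make sure the flag is never forgotten is to build the command in one place. This is only a sketch: it assumes BEAGLE 3.x-style `key=value` arguments with a genotype-likelihoods input passed as `like=` (the kind of file ProduceBeagleInput writes), and the jar path, heap size, and file names are placeholders. Check the argument names against the documentation for the version you are running.

```python
def beagle_command(like_file, out_prefix, jar="beagle.jar", max_heap="4g", lowmem=True):
    """Build an argument list for a BEAGLE 3.x-style run (all paths are placeholders)."""
    return [
        "java",
        f"-Xmx{max_heap}",      # JVM heap limit; adjust to your machine
        "-jar",
        jar,
        f"like={like_file}",    # assumed: genotype-likelihoods input
        f"out={out_prefix}",
        f"lowmem={'true' if lowmem else 'false'}",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(beagle_command("exomes.beagle_input", "exomes_phased")))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the paths are real.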