I have a list of regions (chr:start-end) for which I would like to build two consensus haplotypes. I could not use FastaAlternateReferenceMaker from GATK because it ignores the phasing information within the region of interest. My question is explained in more detail in this image:
https://www.dropbox.com/s/hkg4eff9ba8zg37/Screen%20Shot%202015-12-26%20at%2013.57.49.png?dl=0
I could try GATK HaplotypeCaller, but I have already called SNPs using freebayes and would like to know if there is an existing method/tool to do this. There are tools that construct haplotypes, but they look complicated and are meant for a different purpose. I am looking for something simple. Otherwise I will end up writing a dirty script, which may not work in all possible scenarios.
Thanks. I didn't know about this. I will see if it can do the trick.
I missed something here. How do I get two VCFs, one for each strand?
There is an extra field in the sample column of the new VCF (I don't remember the tag, PG?). It contains the phased genotype, with the two alleles separated by a pipe '|'. Use this field to replace the ALT column of the VCF with one of the alleles, which gives you one VCF per allele.
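For anyone trying this, here is a minimal sketch of the idea, not a tested pipeline. It assumes a single-sample VCF in which the phased genotype is in the standard GT field written with a pipe (e.g. 0|1); if your phasing lives in another tag, adapt the lookup. The function name split_haplotypes is made up for this example. It writes sites-only records (first 8 columns), keeps homozygous calls on both haplotypes, and skips unphased heterozygous calls because they cannot be assigned to either haplotype. It also assumes all phased sites in a region belong to the same phase block.

```python
import re


def split_haplotypes(vcf_lines):
    """Split a single-sample phased VCF into two sites-only VCFs.

    Returns two lists of lines (no trailing newlines), one per haplotype.
    Assumption: phase is encoded in GT with '|' and the sample is diploid.
    """
    hap1, hap2 = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines are copied unchanged to both outputs.
            hap1.append(line)
            hap2.append(line)
            continue
        fields = line.split("\t")
        if line.startswith("#"):
            # Column header: keep only the 8 fixed columns (sites-only VCF).
            header = "\t".join(fields[:8])
            hap1.append(header)
            hap2.append(header)
            continue
        alleles = [fields[3]] + fields[4].split(",")  # REF followed by ALTs
        keys = fields[8].split(":")
        gt = fields[9].split(":")[keys.index("GT")]  # first sample only
        idxs = re.split(r"[|/]", gt)
        if "." in idxs:
            continue  # missing genotype
        if "|" not in gt and len(set(idxs)) > 1:
            continue  # unphased heterozygous call: haplotype is ambiguous
        for out, idx in zip((hap1, hap2), idxs):
            if idx != "0":  # reference allele needs no record
                out.append("\t".join(fields[:4] + [alleles[int(idx)]] + fields[5:8]))
    return hap1, hap2
```

Each of the two outputs can then be written to a file and passed, one at a time, to whatever tool you use to build the consensus sequence for a region.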
It worked. It's a little tricky. I am not sure if there are any small glitches, but the method works. :-)
The new tag is "HP", and the alleles are no longer separated by pipes. They changed the representation format.
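In case it helps, here is a rough sketch of reading the HP tag. It rests on an assumption about the format that you should check against your own file: that HP holds one comma-separated identifier per GT allele, in the same order, each looking like 17385-1 or 17385-2, where the part before the last dash names the phase block and the part after it is the haplotype number. The function name alleles_by_haplotype is made up for this example.

```python
import re


def alleles_by_haplotype(ref, alt, fmt, sample):
    """Map haplotype number -> allele for one VCF record using the HP tag.

    Assumed HP layout (verify on your data): 'BLOCK-1,BLOCK-2', one entry
    per GT allele in order. Returns (block_id, {hap_number: allele}),
    or None if the record has no usable HP value or GT is missing.
    """
    values = dict(zip(fmt.split(":"), sample.split(":")))
    hp = values.get("HP")
    if hp is None or hp == ".":
        return None  # site was not phased
    idxs = re.split(r"[|/]", values["GT"])
    if "." in idxs:
        return None  # missing genotype
    alleles = [ref] + alt.split(",")  # REF followed by ALTs
    block = None
    by_hap = {}
    for idx, tag in zip(idxs, hp.split(",")):
        # Split on the last dash: block identifier, then haplotype number.
        block, hap = tag.rsplit("-", 1)
        by_hap[int(hap)] = alleles[int(idx)]
    return block, by_hap
```

Under that assumption, records sharing the same block identifier can be grouped, and the allele stored under 1 goes to the first haplotype file while the allele under 2 goes to the second. Haplotype numbers are only comparable within one block.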
Hi @geek_y, I would also like to obtain the two haplotypes. My VCF file (output from GATK) also has the HP tag. How did you create the two SNP files based on a block of HP?
Thanks