Question: Generate two haplotypes from bam and vcf
geek_y wrote:

I have a list of regions (chr:start-end) for which I would like to build two consensus haplotypes. I could not use FastaAlternateReferenceMaker from GATK because it ignores the phasing information within the region of interest. My question is shown in more detail in this image:

https://www.dropbox.com/s/hkg4eff9ba8zg37/Screen%20Shot%202015-12-26%20at%2013.57.49.png?dl=0

I could try GATK HaplotypeCaller, but I have already called SNPs using freebayes and would like to know if there is an existing method or tool to do this. There are tools that construct haplotypes, but they look complicated and are meant for different purposes. I am looking for something simple. Otherwise I will end up writing a quick-and-dirty script, which may not work in all possible scenarios.

Tags: snp, freebayes, gatk, vcf
modified 2.9 years ago • written 2.9 years ago by geek_y
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

Just a quick suggestion: use ReadBackedPhasing ( https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_phasing_ReadBackedPhasing.php ) to generate a VCF with the phasing information. Use this phasing to produce two VCFs, one for each haplotype, and then go back to FastaAlternateReferenceMaker?

written 2.9 years ago by Pierre Lindenbaum
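
For regions that contain only SNPs, the last step of the suggestion above (substituting one haplotype's alleles into the reference) can also be sketched directly. This is a minimal illustration, not a replacement for FastaAlternateReferenceMaker: the function name `apply_snps` and the `(pos, ref, alt)` tuples are hypothetical, coordinates are assumed to be 1-based as in VCF, and indels are simply skipped.

```python
def apply_snps(ref_seq, region_start, snps):
    """Return ref_seq with the given SNPs substituted in.

    ref_seq      -- reference bases of the region (string)
    region_start -- 1-based coordinate of ref_seq[0]
    snps         -- iterable of (pos, ref, alt), pos 1-based, for ONE haplotype
    """
    seq = list(ref_seq)
    for pos, ref, alt in snps:
        if len(ref) != 1 or len(alt) != 1:
            continue  # assumption: indels are out of scope for this sketch
        i = pos - region_start
        if not 0 <= i < len(seq):
            continue  # variant lies outside the region
        if seq[i].upper() != ref.upper():
            raise ValueError("REF mismatch at position %d" % pos)
        seq[i] = alt
    return "".join(seq)


# One call per haplotype, each with that haplotype's own SNP list.
hap1 = apply_snps("ACGTACGT", 101, [(103, "G", "T")])
hap2 = apply_snps("ACGTACGT", 101, [(108, "T", "C")])
```

The REF check is there to catch off-by-one mistakes between the region coordinates and the VCF positions.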

Thanks. Didn't know about this. Will see if it can do the trick.

written 2.9 years ago by geek_y

I missed something here. How do I get the two VCFs, one for each haplotype?

written 2.9 years ago by geek_y

There is an extra field in the sample column of the new VCF (I don't remember the tag, PG?) that contains the phased genotype, with the alleles separated by a pipe '|'. Use this column to replace the ALT column of the VCF with one of the alleles.

written 2.9 years ago by Pierre Lindenbaum
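
The splitting step described above could look roughly like the following for a VCF whose GT field is pipe-phased (for example `0|1`). It is only a sketch under stated assumptions: the function name `split_phased_record` is hypothetical, only data lines are handled (header lines would need to be copied separately), and the output is written as sites-only records (first eight columns) with ALT set to the allele carried by that haplotype.

```python
def split_phased_record(line, sample_index=0):
    """Split one VCF data line into (hap1_line, hap2_line).

    An entry is None when that haplotype carries the reference allele,
    or when the genotype is unphased or missing.
    """
    fields = line.rstrip("\n").split("\t")
    alts = fields[4].split(",")
    fmt_keys = fields[8].split(":")
    sample = fields[9 + sample_index].split(":")
    gt = sample[fmt_keys.index("GT")]
    if "|" not in gt:
        return None, None  # unphased: alleles cannot be assigned to haplotypes
    out = []
    for allele in gt.split("|"):
        if allele in ("0", "."):
            out.append(None)  # reference or missing allele: nothing to write
        else:
            record = fields[:8]  # drop FORMAT and sample columns (sites-only)
            record[4] = alts[int(allele) - 1]  # keep only this haplotype's ALT
            out.append("\t".join(record))
    return tuple(out)


line = "chr1\t100\t.\tA\tG,T\t50\tPASS\t.\tGT:DP\t2|0:30"
hap1_line, hap2_line = split_phased_record(line)
```

Writing the non-None lines to two separate files gives one VCF per haplotype. Note that phase is only meaningful within a phased block, so "haplotype 1" in one block is not necessarily the same chromosome copy as "haplotype 1" in the next.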

It worked. It's a little tricky, and I am not sure whether there are any small glitches, but the method works. :-)

The new tag is "HP", and the alleles are no longer separated by pipes; the representation format has changed.

modified 2.9 years ago • written 2.9 years ago by geek_y
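
For the HP representation, a rough sketch of how the alleles might be assigned to haplotypes follows. It assumes (this is not confirmed in the thread) that the HP value holds one identifier per GT allele, in GT order, of the form `<block>-<haplotype>`, for example GT `0/1` with HP `1001-2,1001-1`. The function name `hp_to_haplotypes` is hypothetical; check the assumption against the HP header line and a few records of your own VCF before relying on it.

```python
def hp_to_haplotypes(gt, hp):
    """Return (block_id, allele_on_hap1, allele_on_hap2).

    Assumes hp lists one '<block>-<1|2>' identifier per GT allele, in GT order.
    """
    alleles = gt.replace("|", "/").split("/")
    tags = hp.split(",")
    if len(alleles) != 2 or len(tags) != 2:
        raise ValueError("expected a diploid GT and two HP identifiers")
    blocks = set()
    by_hap = {}
    for allele, tag in zip(alleles, tags):
        block, hap = tag.rsplit("-", 1)
        blocks.add(block)
        by_hap[hap] = allele
    if len(blocks) != 1 or set(by_hap) != {"1", "2"}:
        raise ValueError("unexpected HP value: " + hp)
    return blocks.pop(), by_hap["1"], by_hap["2"]


block, hap1_allele, hap2_allele = hp_to_haplotypes("0/1", "1001-2,1001-1")
```

Records that share the same block identifier belong to the same phased block, so the per-haplotype alleles can be collected block by block; heterozygous sites without an HP value are unphased and cannot be assigned.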

Hi @geek_y, I would also like to obtain the two haplotypes. My VCF file (output from GATK) also has the HP tag. How did you create two SNP files based on a block of HP?

Thanks

written 11 months ago by gaurav.neuro