How to extract phased haplotypes from GATK HaplotypeCaller
13 months ago
Michael 54k

I would like to extract the physically phased haplotypes from a VCF file generated by GATK's HaplotypeCaller on Illumina data of some isolates from different yeast (S. cerevisiae) strains. According to this FAQ:

In the format field of a PGT (Pre-Implantation Genetic Testing) VCF, you may find a description similar to this in the metadata:

 ##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">

I don't quite understand the relevance of PGT (as in pre-implantation genetic testing) in this context. Does it mean that the VCF has to be called in a specific way? My VCF files do not contain any "PGT" field, and grepping for "phas" matches nothing except the default parameters recorded in the header.
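To make that check more systematic than a plain grep, here is a minimal Python sketch. It assumes an uncompressed, tab-delimited VCF and looks for two things: FORMAT header declarations for `PGT`, `PID`, or `PS` (the fields GATK uses for physical phasing information), and genotypes written with the phased separator `|`. The function name `summarize_phasing` is only illustrative and is not part of GATK or any library.

```python
import re

# FORMAT IDs that carry phasing information in GATK output.
PHASING_FORMAT_IDS = {"PGT", "PID", "PS"}


def summarize_phasing(vcf_lines):
    """Report phasing-related FORMAT declarations and count phased GT calls.

    vcf_lines: any iterable of text lines from an uncompressed VCF.
    """
    declared = set()
    phased_gt = 0
    total_gt = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##FORMAT=<ID="):
            match = re.match(r"##FORMAT=<ID=([^,>]+)", line)
            if match and match.group(1) in PHASING_FORMAT_IDS:
                declared.add(match.group(1))
            continue
        if not line or line.startswith("#"):
            continue  # other header lines and blank lines
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # no sample columns on this record
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt_index = keys.index("GT")
        for sample in fields[9:]:
            values = sample.split(":")
            if gt_index >= len(values):
                continue
            total_gt += 1
            # "|" separates phased alleles, "/" separates unphased ones.
            if "|" in values[gt_index]:
                phased_gt += 1
    return {
        "declared": sorted(declared),
        "phased_gt": phased_gt,
        "total_gt": total_gt,
    }
```

For a file on disk, something like `with open("vcf/joint.vcf") as fh: print(summarize_phasing(fh))` would work, where the path is a placeholder. A bgzipped file would need `gzip.open(path, "rt")` instead.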

I know that short reads are not ideal for phasing, but some heterozygous variants lie less than 150 bp apart, so it might be possible to phase them. Ploidy was set to 4 based on published data.
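To get a rough idea of how many sites could be candidates for read-based phasing, heterozygous positions can be grouped into runs in which each site is less than 150 bp from the previous one. The sketch below only illustrates that grouping. The 150 bp threshold is taken from the distance mentioned above, the function name `cluster_nearby_sites` is hypothetical, and the input is assumed to be a list of `(chromosome, position)` pairs for heterozygous calls.

```python
def cluster_nearby_sites(sites, max_gap=150):
    """Group (chrom, pos) pairs into runs of neighbouring sites.

    Two consecutive sites on the same chromosome join the same cluster
    when they are less than max_gap bp apart. Singletons are dropped,
    since a lone site has nothing to be phased against.
    """
    clusters = []
    current = []
    for chrom, pos in sorted(sites):
        same_run = (
            current
            and chrom == current[-1][0]
            and pos - current[-1][1] < max_gap
        )
        if same_run:
            current.append((chrom, pos))
        else:
            if len(current) > 1:
                clusters.append(current)
            current = [(chrom, pos)]
    if len(current) > 1:
        clusters.append(current)
    return clusters
```

Because sites are chained pairwise, a cluster as a whole can span more than 150 bp, so this gives an optimistic upper bound rather than a guarantee that one read covers every site in the cluster.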

How can I extract the four phased haplotypes in FASTA format, either within GATK or with any other tool (I am aware of WhatsHap)?


HaplotypeCaller was run on each isolate using these arguments:

PLOIDY=4
REF=reference-genome/GCF_000146045.2_R64_genomic.fna # yeast RefSeq assembly
gatk --java-options "-XX:ConcGCThreads=10 -XX:ParallelGCThreads=10  -Xmx200G -Djava.io.tmpdir=$TMPDIR" \
HaplotypeCaller --ploidy $PLOIDY --native-pair-hmm-threads 40 -R $REF \
-G StandardAnnotation -G AS_StandardAnnotation -G StandardHCAnnotation \
 --tmp-dir $TMPDIR \
 -I bam/$BASE-dedup.bam -O vcf/$BASE.g.vcf -ERC GVCF &

Then, a joint variant call was created by running GenomicsDBImport and GenotypeGVCFs.

In addition, the VCF header records this default option:

--do-not-run-physical-phasing false

I interpret this to mean that physical phasing is switched on.
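The value of that option can also be read from the header programmatically. This is a small sketch that assumes the option appears in a `##` header line as the flag followed by whitespace and its value, as in the line quoted above. The helper name `find_option_values` is made up for illustration.

```python
import re


def find_option_values(vcf_lines, option="--do-not-run-physical-phasing"):
    """Collect every value recorded for a command-line option in the VCF header."""
    # The value ends at whitespace, a quote, a comma, or a closing bracket.
    pattern = re.compile(re.escape(option) + r"\s+([^\s\",>]+)")
    values = []
    for line in vcf_lines:
        if not line.startswith("##"):
            break  # end of the meta-information lines
        values.extend(pattern.findall(line))
    return values
```

A result of `["false"]` would be consistent with physical phasing not having been disabled. It does not by itself show that any variants were actually phased.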
