Trio-based analysis: pipeline for phasing the data
7.7 years ago
newbee ▴ 40

I obtained trio-based exome data (VCF files) from a vendor. The VCF files contain PGT and PID values in many rows. The GATK documentation describes these as physical phasing haplotype information and physical phasing ID information, respectively. I need to detect de novo, autosomal recessive, and compound heterozygous variants. For the last two, I would like to phase the data using the tool 'Beagle'. My plan is to take each trio (unaffected father, unaffected mother, and affected child), remove PGT and PID, and then run Beagle to phase the data. After that, I would like to use Gemini to call autosomal recessive and compound heterozygous variants. Is this the right approach? Since I have already spent a lot of time on Gemini, I would like to keep using it. Also, can I skip the phasing step if I keep the PGT and PID values in the VCF file?
I am new to the field and have very limited knowledge, so it would be a great help if you could also point out any additional issues that I have not thought of.
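
To make the "remove PGT and PID" step concrete, here is a minimal sketch, assuming an uncompressed, tab-delimited VCF read line by line. The function name `strip_format_keys` is hypothetical and not part of GATK, Beagle, or Gemini. It only edits the FORMAT column and the per-sample columns; the matching `##FORMAT` header lines are left as they are. The same result can probably be obtained with `bcftools annotate -x FORMAT/PGT,FORMAT/PID`.

```python
def strip_format_keys(line, drop=("PGT", "PID")):
    """Return one VCF line with the given FORMAT keys removed from every sample."""
    # Header lines (## metadata and the #CHROM line) pass through unchanged.
    if line.startswith("#"):
        return line

    ending = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")

    # Columns 1-8 are fixed; FORMAT is column 9, samples start at column 10.
    if len(fields) < 10:
        return line

    keys = fields[8].split(":")
    keep = [i for i, key in enumerate(keys) if key not in drop]
    fields[8] = ":".join(keys[i] for i in keep)

    for col in range(9, len(fields)):
        values = fields[col].split(":")
        # A sample may omit trailing FORMAT values, so guard the index.
        fields[col] = ":".join(values[i] for i in keep if i < len(values))

    return "\t".join(fields) + ending


if __name__ == "__main__":
    record = (
        "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD:DP:GQ:PGT:PID:PL\t"
        "0/1:5,5:10:99:0|1:100_A_G:99,0,99\t0/0:10,0:10:30\n"
    )
    print(strip_format_keys(record), end="")
```

The second sample in the usage example has fewer values than FORMAT keys, which is why the index guard is there.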

Thanks a lot.

Trio-based GATK haplotypecaller Beagle Phasing • 3.4k views
