How to determine parent of origin of phased variants when all members are het
3.6 years ago

I'm a little confused about the format of phased variants in a VCF file. I had assumed that A|B means Paternal Haplotype | Maternal Haplotype, but I believe it's Haplotype Block A | Haplotype Block B.

So in the case of

KID    DAD    MOM
0|1    0|1    0|1


How does one determine the parent of origin for the derived allele?

phase vcf
3.6 years ago

I believe it's Haplotype Block A | Haplotype Block B.

Indeed. Wherever read-based evidence allows, variants are phased in blocks, but this information is lost across, e.g., repetitive elements. So the two sides of the pipe do not correspond to the maternal and paternal haplotypes, and you need long-distance phasing (statistical, or using different sequencing technologies) to get longer 'blocks'.
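To see how long your blocks actually are, you can group the phased records by their phase set. Here is a minimal sketch, assuming your phaser writes the standard PS FORMAT field and that the sample of interest is the first sample column. The function name `phase_blocks` and the `sample_index` parameter are made up for illustration. The left|right order of GT is only consistent between records that share the same PS value.

```python
def phase_blocks(vcf_lines, sample_index=0):
    """Map (chrom, PS) to the (first_pos, last_pos) span of its phased records.

    Assumes plain-text, tab-separated VCF lines with a PS FORMAT field.
    """
    blocks = {}
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        keys = fields[8].split(":")                    # FORMAT column
        values = fields[9 + sample_index].split(":")   # chosen sample column
        sample = dict(zip(keys, values))
        # Keep only phased genotypes that carry a phase-set identifier.
        if "|" not in sample.get("GT", "") or sample.get("PS", ".") == ".":
            continue
        key = (chrom, sample["PS"])
        first, last = blocks.get(key, (pos, pos))
        blocks[key] = (min(first, pos), max(last, pos))
    return blocks


example = [
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT:PS\t0|1:100",
    "chr1\t250\t.\tC\tT\t.\tPASS\t.\tGT:PS\t1|0:100",
    "chr1\t900\t.\tG\tA\t.\tPASS\t.\tGT:PS\t0|1:900",
    "chr1\t950\t.\tG\tA\t.\tPASS\t.\tGT\t0/1",
]
blocks = phase_blocks(example)
```

In this toy input the first two records share a block, the third starts a new one, and the unphased fourth record is ignored.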

Based on your example alone, it's not possible to determine the parent of origin, but perhaps other SNPs in the same block can help pin this down.

Lame. I was afraid this was the case. The phasing leveraged nanopore long reads, so the blocks are really long. You would think that phasing software that uses inheritance information would encode the parent of origin. Thanks for the insight. To add, if anyone knows of software that can assign parent of origin in ambiguous cases like this, please comment.

phasing software that uses inheritance information

That's new information; you didn't specify this before.

It should be feasible to identify the parent for each block if the blocks are sufficiently big. You just have to find variant(s) specific to one parent within the block, and then you can assign the entire block.
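A rough sketch of that idea, in case it helps. It is not taken from any particular tool, and all function names are hypothetical. It assumes you have already pulled out, for each kid-het site, the phase set and the three GT strings, that there are no genotyping errors beyond what a majority vote can absorb, and that the ambiguous site is biallelic. A site is informative when only one assignment of the kid's two haplotypes to dad and mom is consistent with the parents' genotypes.

```python
from collections import Counter


def alleles(gt):
    """Split a GT string such as '0|1' or '0/1' into its allele strings."""
    return gt.replace("|", "/").split("/")


def paternal_side(kid, dad, mom):
    """Return 0 or 1 for the side of the kid's phased GT that came from dad.

    Returns None when the site is unphased, homozygous, missing,
    ambiguous (like the all-het case), or Mendelian-inconsistent.
    """
    if "|" not in kid:
        return None
    k = alleles(kid)
    d, m = set(alleles(dad)), set(alleles(mom))
    if "." in k or "." in d or "." in m or k[0] == k[1]:
        return None
    dad_left = k[0] in d and k[1] in m   # left haplotype from dad is possible
    dad_right = k[1] in d and k[0] in m  # right haplotype from dad is possible
    if dad_left and not dad_right:
        return 0
    if dad_right and not dad_left:
        return 1
    return None


def assign_blocks(sites):
    """sites: iterable of (phase_set, kid_gt, dad_gt, mom_gt).

    Returns {phase_set: 0, 1, or None} by majority vote of informative sites.
    """
    votes = {}
    for ps, kid, dad, mom in sites:
        counter = votes.setdefault(ps, Counter())
        side = paternal_side(kid, dad, mom)
        if side is not None:
            counter[side] += 1
    return {
        ps: (0 if c[0] > c[1] else 1 if c[1] > c[0] else None)
        for ps, c in votes.items()
    }


def alt_origin(kid, side_from_dad):
    """Name the parent that transmitted the '1' allele at a biallelic het site."""
    alt_side = kid.split("|").index("1")
    return "dad" if alt_side == side_from_dad else "mom"


sites = [
    ("1000", "0|1", "0|1", "0|1"),  # the ambiguous all-het site
    ("1000", "1|0", "0/1", "0/0"),  # informative: alt can only come from dad
    ("1000", "1|0", "1/1", "0/0"),  # informative: same conclusion
]
block_side = assign_blocks(sites)
origin = alt_origin("0|1", block_side["1000"])
```

Here the two informative sites both say that the left haplotype of block 1000 is paternal, so at the all-het site the derived allele (on the right) would be assigned to mom. With real data you would want several agreeing informative sites per block before trusting the call, since a switch error inside the block would break the assumption that one vote applies to the whole block.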