Question: How do variant callers get genotype info?
9 months ago by
afzm20 wrote:

In a diploid species, how do variant callers determine which copy of a chromosome carries each detected heterozygous variant? What information do they use? Just paired-end reads?

I found some statistical details on the GATK web page, but I would like to understand whether other information is used, the rationale behind it, the accuracy it can achieve, and the factors that affect this process.

Thank you very much

9 months ago by
Philadelphia, PA
Jeremy Leipzig (18k) wrote:

Usually the caller has no idea which chromosome homolog a variant is on. It can only see variants that fall in the same read or read pair (unlikely for short reads unless the variants are close together), or it can try to infer which variants are on the same homolog (i.e. phase them) using read-backed phasing, as part of the local read assembly performed by GATK's HaplotypeCaller.

These in silico methods are spotty at best. Most people who need phasing just use a long-read technology, or they sequence the parents.
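To make the read-backed idea concrete, here is a toy Python sketch. It is not what GATK or any real caller does internally. It only counts how often alleles at two heterozygous sites are seen together on the same read or read pair. The function names, the positions (100 and 150), and the read data are all made up for illustration.

```python
from collections import Counter

def count_linked_alleles(reads, site_a, site_b):
    """Count allele pairs seen together on one read (or read pair).

    `reads` is a hypothetical simplified input: one dict per read,
    mapping a variant position to the base observed at that position.
    Reads that do not span both sites carry no phasing information.
    """
    counts = Counter()
    for alleles in reads:
        if site_a in alleles and site_b in alleles:
            counts[(alleles[site_a], alleles[site_b])] += 1
    return counts

def infer_phase(counts, het_a, het_b):
    """Return 'cis', 'trans', or None when the evidence is tied or absent.

    het_a and het_b are (ref, alt) tuples for the two heterozygous sites.
    'cis' means the two alt alleles sit on the same homolog.
    """
    cis = counts[(het_a[0], het_b[0])] + counts[(het_a[1], het_b[1])]
    trans = counts[(het_a[0], het_b[1])] + counts[(het_a[1], het_b[0])]
    if cis == trans:
        return None
    return "cis" if cis > trans else "trans"

# Made-up reads over two het sites at positions 100 (A/T) and 150 (G/C).
toy_reads = [
    {100: "A", 150: "G"},
    {100: "A", 150: "G"},
    {100: "T", 150: "C"},
    {100: "A"},            # spans only one site: uninformative
    {150: "C"},            # spans only one site: uninformative
    {100: "T", 150: "G"},  # conflicting read, e.g. a sequencing error
]

linked = count_linked_alleles(toy_reads, 100, 150)
phase = infer_phase(linked, ("A", "T"), ("G", "C"))
print(phase)
```

Only four of the six toy reads span both sites, which is why short reads phase so little: once two heterozygous sites are farther apart than a read pair can reach, there is nothing to count. When a caller does manage to phase genotypes, the VCF records them with `|` instead of `/` (e.g. `0|1` rather than `0/1`).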

9 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum (118k) wrote:

Mathematical Notes on SAMtools Algorithms

.... good luck...

