What are the existing proposals for how to approach genomic coordinates in a pangenome reference environment?
3 months ago
LauferVA 3.7k

This month (May 2023) saw the publication of a first draft of the human pangenome reference.

With a single linear reference, assigning genomic coordinates to a variant is straightforward: simply count the number of bases in the reference between the telomere and the variant.

However, in a pangenome reference things aren't as straightforward (or are they?). Consider a variant somewhere on chromosome 1. Now there are many paths to the variant, and the number of bases traversed between the telomere and the variant depends on which path is taken.
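To make the concern concrete, here is a toy sketch of what I mean. Everything in it (node names, sequences, haplotype labels, the `offset_of` helper) is made up for illustration and is not taken from any real pangenome or tool. Two paths reach the same node, but the base count to that node differs between them.

```python
# Toy variation graph: node id -> sequence (all ids and sequences are invented).
nodes = {
    "n1": "ACGT",
    "n2": "G",        # short allele
    "n3": "GATTACA",  # alternative, longer allele at the same site
    "n4": "TTC",      # node carrying the variant of interest
}

# Two hypothetical haplotype paths through the same graph.
paths = {
    "hapA": ["n1", "n2", "n4"],
    "hapB": ["n1", "n3", "n4"],
}

def offset_of(node_id, path):
    """Return the 0-based offset of node_id's first base along the given path."""
    offset = 0
    for step in path:
        if step == node_id:
            return offset
        offset += len(nodes[step])
    raise ValueError(f"{node_id} is not on this path")

# The same node gets a different coordinate on each path.
offsets = {name: offset_of("n4", path) for name, path in paths.items()}
print(offsets)  # {'hapA': 5, 'hapB': 11}
```

So "position of the variant" is only well defined once a path has been chosen.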

Further, if one attempts to solve this problem with a rule such as "the most common path at each locus is the reference for that locus", new problems arise. Simply adding genomes to the pangenome reference could change which path is the most common, and thus change the numbering at all loci distal to that one. Similarly, the most common paths in one subpopulation may well differ from those in another, so the coordinates would depend on which samples one works from.
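A toy illustration of that instability, again with invented values (the allele lengths, the cohort, and the `downstream_coordinate` helper are all hypothetical): under a "most common allele is the reference" rule, adding two haplotypes flips the reference allele and shifts the coordinate of everything after the locus.

```python
from collections import Counter

# Hypothetical lengths of two alleles at a single locus.
allele_len = {"short": 1, "long": 7}

def downstream_coordinate(haplotype_alleles, upstream_len=4):
    """Pick the most common allele as reference and return
    (reference allele, 0-based coordinate of the first base after the locus)."""
    ref_allele, _ = Counter(haplotype_alleles).most_common(1)[0]
    return ref_allele, upstream_len + allele_len[ref_allele]

cohort = ["short", "short", "long"]
before = downstream_coordinate(cohort)                   # 'short' is most common
after = downstream_coordinate(cohort + ["long", "long"])  # now 'long' is most common

print(before)  # ('short', 5)
print(after)   # ('long', 11)
```

Nothing about the downstream sequence changed, yet its coordinate moved from 5 to 11 just because the sample set grew.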

For all these reasons, it is not clear to me how genomic coordinates will be assigned in the context of a pangenome reference. With that as background, my question is: does anyone know of an accessible resource to start learning more about this? Are there published proposals, or even a consensus on what to do?

coordinates pangenome genomic • 367 views
