How to add non-reference genes to BED file of intervals for PHG
7 weeks ago

Hi, one of the reasons to build a pangenome is to capture genes which might be absent in the reference genome of interest. While following the documentation of the PHG you get to a point where you need to "Create bedfile of intervals for PHG reference ranges".

My question is how to add non-reference genes to such a file. Should I do it by adding genomic ranges in the reference genome that roughly match the equivalent regions in other assemblies that contain those non-reference genes?

Consider the following region from chr3H in barley adapted from barley_pangenes:

chr  start   end     reference_gene             assembly2_gene
3H   331491  334552  HORVU.MOREX.r3.3HG0218270  Horvu_10350_3H01G001500
3H   NA      NA      NA                         Horvu_10350_3H01G001600
3H   414358  417904  HORVU.MOREX.r3.3HG0218320  Horvu_10350_3H01G001700

  • Should I merge these gene models and create a custom range such as 3H:331491-417904 to make sure reads matching non-reference gene Horvu_10350_3H01G001600 are used to build haplotypes?

Thanks for your help, Bruno

pangenome plants PHG • 329 views
7 weeks ago
lcj34 ▴ 420

Hi Bruno, you should not need to create custom ranges. The AnchorWave alignment process aligns full genomes to each other using collinear regions identified through the CDS and exon regions of a GFF file. AnchorWave can also identify novel anchors within long inter-anchor regions. Based on this, we expect all, or nearly all, of each assembly genome's sequence to be included as haplotype sequence aligned to either a reference-defined conserved or non-conserved region. Later, when imputing against the pangenome that includes the haplotypes from all assemblies, you should be able to align reads to these regions.

You can read more about AnchorWave in the README on its GitHub page (https://github.com/baoxingsong/AnchorWave) or in the AnchorWave paper (https://www.pnas.org/doi/abs/10.1073/pnas.2113075119).
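
To illustrate why a merged custom range should not be needed, here is a minimal Python sketch (not part of PHG or AnchorWave) that builds BED-style intervals from the reference genes in the table above and fills the gap between them with an inter-genic range. It rests on several assumptions: that the table coordinates are 1-based inclusive, that the gaps between genic ranges are themselves kept as reference ranges (the "non-conserved" regions mentioned above), and that the fourth column is only an illustrative label. The function name `reference_ranges` is hypothetical.

```python
# Hypothetical helper for illustration only; not a PHG or AnchorWave API.
# Rows from the question: (chrom, start, end, reference_gene).
# The non-reference gene Horvu_10350_3H01G001600 has no reference coordinates.
rows = [
    ("3H", 331491, 334552, "HORVU.MOREX.r3.3HG0218270"),
    ("3H", None, None, None),
    ("3H", 414358, 417904, "HORVU.MOREX.r3.3HG0218320"),
]


def reference_ranges(rows):
    """Return (chrom, start, end, name) tuples: genic ranges plus the gaps between them."""
    # Keep only genes that exist in the reference, sorted by position.
    genes = sorted((r for r in rows if r[1] is not None), key=lambda r: (r[0], r[1]))
    bed = []
    prev_chrom, prev_end = None, None
    for chrom, start, end, name in genes:
        # Assumption: input is 1-based inclusive; BED is 0-based half-open.
        bed_start = start - 1
        if chrom == prev_chrom and bed_start > prev_end:
            # The gap between two reference genes becomes its own range.
            bed.append((chrom, prev_end, bed_start, "intergenic"))
        bed.append((chrom, bed_start, end, name))
        # Track the furthest end seen so overlapping genes do not create bogus gaps.
        prev_end = end if chrom != prev_chrom else max(prev_end, end)
        prev_chrom = chrom
    return bed


for interval in reference_ranges(rows):
    print("\t".join(map(str, interval)))
```

Under these assumptions, the sequence carrying Horvu_10350_3H01G001600 in the other assembly would be expected to align to the inter-genic range between the two flanking reference genes, rather than requiring a merged range such as 3H:331491-417904.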
