Hi,
I'm still trying to wrap my head around the new human pangenome reference and would like some advice on how to go about analyzing some of the WES data (hg38/hg19 baits) that I currently have.
How should I calculate the coverage of a specific gene, say, BRCA1, based on the human pangenome annotation? Would it make sense to compare the coverage at all, or should I be focusing on the coverage of the best ancestry-matched pangenome reference?
Is there a visualization tool to show how reads are mapped to the pangenome at a certain gene? I am envisioning the 47 pangenomes being displayed one per row, with reads piled up under each reference.
Thanks so much for the guidance and advice!
I don't know of any tool to visualize all of the haplotypes simultaneously like you are describing. The most capable visualization tool for mapped reads is probably sequence tube map, but it's a graph-based layout. You can use it to query reference regions if you want to look at a particular region. If you want to calculate coverage, there are tools to do that in vg pack, but first you'll need to convert the reference region into node IDs.
Great response - in your first sentence you mention not knowing of such a tool - in what ways does the vg sequence tube map you link to not do this?
ty
You will not see all of the haplotypes separately as independent, linear tracks, each with its own read alignments. Instead, the (distinct) haplotypes are shown as walks through a graph. The alignments are likewise shown only once (rather than once for each haplotype), threaded through the graph.
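To make the "walks through a graph" idea a bit more concrete, here is a minimal Python sketch on a made-up toy graph. It is not how vg or sequence tube map are implemented; every name and number in it (node_len, walks, reads, the helper functions) is hypothetical. It only illustrates three things: haplotypes are ordered lists of node IDs, a linear interval on one walk translates into a set of node IDs, and each alignment is counted once on the nodes it touches, so the same per-node counts can be read off along any haplotype.

```python
# Toy graph: node ID -> sequence length in bp (hypothetical values).
node_len = {1: 4, 2: 1, 3: 1, 4: 5, 5: 3}

# Haplotypes as walks (ordered node IDs). Nodes 2 and 3 form a SNP bubble.
walks = {
    "ref": [1, 2, 4, 5],
    "hapA": [1, 3, 4, 5],
}

# Each read alignment is stored once, as the list of nodes it traverses.
reads = [[1, 2, 4], [1, 3, 4], [4, 5], [2, 4]]


def nodes_in_interval(walk, start, end):
    """Node IDs on `walk` overlapping the 0-based half-open interval [start, end)."""
    hits = []
    pos = 0
    for node in walk:
        node_end = pos + node_len[node]
        if pos < end and node_end > start:
            hits.append(node)
        pos = node_end
    return hits


def node_coverage(alignments):
    """Number of reads touching each node (a simplification of per-base coverage)."""
    cov = {node: 0 for node in node_len}
    for aln in alignments:
        for node in aln:
            cov[node] += 1
    return cov


def mean_coverage(walk, start, end, cov):
    """Length-weighted mean of node read counts over the nodes hit by the interval."""
    nodes = nodes_in_interval(walk, start, end)
    total_len = sum(node_len[n] for n in nodes)
    return sum(cov[n] * node_len[n] for n in nodes) / total_len


cov = node_coverage(reads)
for name, walk in walks.items():
    # The same linear interval maps to different nodes on different haplotypes.
    print(name, nodes_in_interval(walk, 3, 6), mean_coverage(walk, 3, 6, cov))
```

In this toy, the interval [3, 6) hits nodes 1, 2, 4 on "ref" and nodes 1, 3, 4 on "hapA", which is why a region has to be translated into node IDs before coverage can be summarized over it. Real coverage would be computed per base rather than per node.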
Thank you for the response! What is the best way to convert the reference into node IDs? Is there a liftover tool to convert hg38's coordinates to the new reference?
I also aligned my reads to the human pangenome reference, and I am interested in the HLA region. I would also be happy to hear how to evaluate the coverage, visualize the mapping, and create a VCF from my alignments.