I have recently managed to run the Pathfinding step of the Practical Haplotype Graph pipeline (see here; PHG v1.4).
I now want to retrieve the calculated best path, i.e. the list of haplotype IDs that make up that path, so I wrote a script that does this (by co-opting the decodePathsForMultipleLists() function of the DBLoadingUtils class). I tested the script by retrieving the path for my reference genome and got the same number of IDs as there are reference ranges, as expected.
However, when I retrieve the IDs for some of my imputed paths, I get an extremely low number of IDs. For example, one sample has 66 IDs in its path, despite there being 422,593 reference ranges. I also did not find any errors in the log written for the run. Is this an issue with the chosen parameters, an error, or are all other ranges simply matching the reference genome?
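To put the numbers in proportion, here is a minimal Python sketch of the comparison. It assumes the haplotype IDs for a sample have already been extracted into a plain list and that each reference range contributes at most one ID to a path. The function name `path_coverage` and the placeholder IDs are hypothetical and not part of PHG.

```python
def path_coverage(hap_ids, n_ref_ranges):
    """Return (number of unique IDs, fraction of reference ranges they cover).

    Assumes at most one haplotype ID per reference range.
    """
    n_ids = len(set(hap_ids))
    return n_ids, n_ids / n_ref_ranges


# Placeholder IDs standing in for the 66 IDs found for one sample.
sample_ids = list(range(66))
n_ids, fraction = path_coverage(sample_ids, 422_593)
print(f"{n_ids} IDs cover {fraction:.4%} of the reference ranges")
```

With 66 IDs and 422,593 reference ranges this comes to roughly 0.016 % of the ranges.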
I would greatly appreciate any pointers on this.
The parameters I used for the run were:
## DB params....
## other params:
#--- Used for writing a pangenome reference fasta(not needed when inputType=vcf) ---
pangenomeHaplotypeMethod=assembly_by_anchorwave
pangenomeDir=/PHG/outputDir/pangenome
indexKmerLength=21
indexWindowSize=11
indexNumberBases=90G
#--- Used for mapping reads
inputType=fastq
readMethod=lineA_run1
keyFile=/PHG/fullRunConfigs/lineA_run1_readMapping_key_file.txt
fastqDir=/PHG/inputDir/imputation/fastq/
samDir=/PHG/inputDir/imputation/sam/
lowMemMode=true
maxRefRangeErr=0.25
outputSecondaryStats=false
maxSecondary=20
fParameter=f15000,16000
minimapLocation=minimap2
#--- Used for path finding
pathHaplotypeMethod=assembly_by_anchorwave
pathMethod=lineA_run1
maxNodes=1000
maxReads=10000
minReads=1
minTaxa=1
minTransitionProb=0.0005
numThreads=80
probCorrect=0.99
removeEqual=false
splitNodes=true
splitProb=0.99
usebf=true
minP=0.8
#--- Used to output a vcf file for pathMethod
#~~~ Optional Parameters ~~~
outVcfFile=/PHG/outputDir/align/lineA_run1_variants.vcf
localGVCFFolder=/PHG/inputDir/loadDB/gvcf
debugDir=/PHG/debugDir/lineA_run1/
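For anyone who wants to check which path-finding values were actually set, here is a small Python sketch that reads such a config into a dictionary. It assumes the file on disk has one key=value pair per line with `#` marking comments. The function `parse_config` is a hypothetical helper and does not reflect how PHG itself reads the file.

```python
def parse_config(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = line.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return params


# A short excerpt of the path-finding section from the config above.
example = """\
#--- Used for path finding
pathHaplotypeMethod=assembly_by_anchorwave
minReads=1
minTaxa=1
minP=0.8
"""

params = parse_config(example)
for name in ("minReads", "minTaxa", "minP"):
    print(name, "=", params.get(name))
```

Values are kept as strings, so any numeric comparison would need an explicit conversion.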