PHG - ImputePipelinePlugin fails when trying to impute SNPs on a gvcf file.
5 months ago
sjp6181 • 0

Hello everyone, I hope you're doing great.

I'm trying to impute a gvcf using a PHG database. As far as I can tell from the logs (attached here) of steps 1 and 2 in the PHG Wiki guide, I have established and populated the PHG db with haplotypes correctly (there is not a single 'ERROR' message in any log). The problem comes when I run the imputation part on an example gvcf, where I get the following error at the net.maizegenetics.pangenome.hapCalling.SNPToReadMappingPlugin - Processing record: 100_Ma100,wgsFlowcell,Ma100_.vcf.gz,wgs step:

ERROR net.maizegenetics.plugindef.AbstractPlugin - currentIndexLine must not be null

The command that I used was:

singularity exec -B ${WORKING_DIR}/:/phg/ ${WORKING_DIR}/phg_16.simg /tassel-5-standalone/run_pipeline.pl -Xmx20G -debug -configParameters imputevcfconfig.txt -ImputePipelinePlugin -imputeTarget map -localGVCFFolder /phg/inputDir/loadDB/gvcf/ -localGVCFDir /phg/inputDir/loadDB/gvcf/  -endPlugin > 08_VCF_Imputation.log

And the pipeline stops. Also, the pangenome folder in outputDir/ is empty, and the vcfIndex file in outputDir/ contains only the headers and no other information, which makes me wonder whether an earlier mistake might be causing these problems.

I have attached the logs for each step, the config files, the key files and the example gvcf at the following link:

https://drive.google.com/drive/folders/1s318N3OCQLm_okDLr5UIYjRED05jl7XK?usp=sharing

Any help or guidance would be much appreciated. If you need any other information to clarify what might be happening please let me know. Thank you!

phg • 341 views
5 months ago
pjb39 ▴ 200

The index file is empty because the consensus step collapsed everything to a single genome, probably the reference, which means the consensus haplotypes have no variants. This happened because mxDiv was too high for your dataset. My suggestion is to skip the consensus step. To do that for imputation, set pangenomeHaplotypeMethod and pathHaplotypeMethod to GATK_PIPELINE.
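
As a rough sketch, the change in the imputation config file (imputevcfconfig.txt in the command above) might look like the lines below. This assumes the file uses the usual key=value layout; only the two parameter names and the GATK_PIPELINE value come from this answer, and every other setting in the file stays as it is.

```
# Assumed key=value layout; use the haplotypes loaded from the gvcfs directly
# instead of the consensus haplotypes.
pangenomeHaplotypeMethod=GATK_PIPELINE
pathHaplotypeMethod=GATK_PIPELINE
```

After editing the file, rerun the same ImputePipelinePlugin command and check whether the pangenome folder and the vcfIndex file in outputDir/ are now populated.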
