I'm having trouble phasing a multi-sample (9 samples) VCF file produced by GATK HaplotypeCaller with Beagle 5.2. I do not have a genetic map or a reference panel. I am working with a very heterozygous group of organisms (sea urchins). When I run Beagle with the following command:
java -Xmx100g -jar beagle_5.2.jar gt=filtered_calls.vcf.gz out=phased impute=false
I get this error:
Window 470 [NW_022145483.1:11416-24979]
Reference markers: 630
Study markers: 630
Burnin iteration 1: 0 seconds
Exception in thread "main" java.lang.IllegalArgumentException: 0
    at main.RunStats.printEstimatedNe(RunStats.java:260)
    at main.Main.phaseStage1Variants(Main.java:195)
    at main.Main.phaseTarg(Main.java:181)
    at main.Main.phaseAndImpute(Main.java:171)
    at main.Main.main(Main.java:126
Beagle usually runs fine for about 15 minutes and writes about 1.4 GB of phased genotypes, then crashes. I'm not sure what this exception means. I have tried changing the memory allocation and the window size (anywhere between 5 and 40), and neither has helped. Changing the window size only changes which scaffold Beagle is processing when it crashes. I'm fairly sure the input VCF itself is not the problem, since other programs have run successfully with it as input.
Are there other programs that will phase without a genetic map and reference panel? Should I split my VCF into a separate file for each chromosome before phasing with Beagle? I have access to an HPC cluster, so memory should not be an issue.
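
For reference, here is a minimal sketch of what per-scaffold phasing could look like. It is not a confirmed fix for the exception above. It assumes Beagle's chrom= argument can restrict a run to one contig, which would avoid physically splitting the VCF. The jar, input file and memory setting are the ones from my command; the phased.<contig> output prefix is a made-up naming choice. The script only prints the commands (a dry run) and does not run Beagle.

```shell
#!/usr/bin/env bash
# Dry run: print one Beagle command per contig name read from stdin.
# Assumption: Beagle's chrom= argument limits a run to the named contig.
# The phased.<contig> output prefix is a hypothetical naming choice.
beagle_cmds() {
  local contig
  while IFS= read -r contig; do
    [ -n "$contig" ] || continue   # skip blank lines
    printf 'java -Xmx100g -jar beagle_5.2.jar gt=filtered_calls.vcf.gz chrom=%s out=phased.%s impute=false\n' \
      "$contig" "$contig"
  done
}

# List contigs in file order from the VCF body (assumes records are grouped
# by contig, as in a sorted VCF), then print one command per contig.
if [ -f filtered_calls.vcf.gz ]; then
  gzip -dc filtered_calls.vcf.gz | grep -v '^#' | cut -f1 | uniq | beagle_cmds
fi
```

The printed lines could be piped to bash or submitted as separate cluster jobs. Run that way, a scaffold that triggers the exception would fail on its own and identify itself without stopping the others.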