When I run the following command:
vg sim -x graph.gbz -g graph.gbwt -n 10 -l 150 -p 335 -v 130 -m SAMPLE -a --multi-position -t 30 > simreads.gam
I receive these errors:
Inserting 1 GBWT threads into the graph
error: [insert_gbwt_path()] path name already exists: SAMPLE_chr1
error: [insert_gbwt_path()] path name already exists: SAMPLE_chr10
... etc. for every chromosome
Inserted 0 paths
The command seems to run fine otherwise and produces a .gam file of the simulated reads. For context, I produced the .gbz file using Minigraph-Cactus and built the .gbwt from the .gbz file with the vg gbwt command. I get the same errors if I pass a .xg file instead of a .gbz to the -x argument, and also if I remove the --multi-position argument. This is with vg v1.52.0.
What might be causing this, and is it something to worry about?
Thank you for your help!
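I think the problem might be that you are supplying both a GBWT and a GBZ file. The GBZ file already contains a GBWT, so the haplotypes named in the errors are probably already present in the graph when vg sim tries to insert them again from the -g file.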
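If the duplicate GBWT is indeed the cause, the log would be consistent with a plain name collision: every haplotype path offered for insertion is rejected because a path of that name is already there, so the final count is zero. The following is only a toy Python model of that behaviour, not vg's actual implementation; the function insert_paths is hypothetical, and the two path names are the ones shown in the error log.

```python
def insert_paths(existing_names, new_names):
    """Toy model: add each new name unless a path with that name already exists."""
    inserted = 0
    errors = []
    for name in new_names:
        if name in existing_names:
            # Mirrors the wording of the error in the log; nothing is inserted.
            errors.append(f"path name already exists: {name}")
        else:
            existing_names.add(name)
            inserted += 1
    return inserted, errors


# Assumption: the graph already holds the haplotype paths named in the errors.
graph_paths = {"SAMPLE_chr1", "SAMPLE_chr10"}
inserted, errors = insert_paths(graph_paths, ["SAMPLE_chr1", "SAMPLE_chr10"])
for message in errors:
    print("error:", message)
print(f"Inserted {inserted} paths")
```

In this model, every requested name collides, which matches the "Inserted 0 paths" line in the log.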
I've worked out what this error leads to, even though a .gam file is produced: vg sim is unable to simulate reads from the specified sample and, as far as I can tell, falls back to simulating reads from any sample in the graph instead.
For instance, when I run vg sim twice with the same parameters and the same seed, each time specifying a different sample from the graph with -m, the reads generated are identical (though in a different order) and the two .gam files are the same size.
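A comparison like this can be made explicit by treating each run as a multiset of read sequences, which ignores order but still counts duplicates. Below is a minimal Python sketch, assuming the read sequences from each run have already been written one per line to a text file; the function names and the script name in the usage comment are hypothetical.

```python
from collections import Counter


def read_multiset(lines):
    """Count each non-empty read sequence, ignoring order and surrounding whitespace."""
    return Counter(line.strip() for line in lines if line.strip())


def same_reads(lines_a, lines_b):
    """Return True if both inputs hold the same sequences with the same multiplicities."""
    return read_multiset(lines_a) == read_multiset(lines_b)


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python compare_reads.py sampleA.txt sampleB.txt
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as a, open(sys.argv[2]) as b:
            print("identical read sets" if same_reads(a, b) else "read sets differ")
```

One way to produce the input files is to convert each .gam to JSON with vg view -a and pull out the sequence field, for example with jq -r .sequence. If the two runs really simulate from different samples, the read sets should differ.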