vg is running, but it consumes almost no memory and its log files are not updated
4 months ago

Hi there, I am using vg for Giraffe mapping.

I started vg on 2022.01.13 and it is still running now (2022.01.17).

vg creates log files: chunked FASTA files, chunked VCF files, etc.

I have checked the log files, and they are not being updated. So I looked at the output of the Linux top command to check vg's CPU and memory consumption.

vg consumes CPU but does not consume memory, so I thought vg was running.

But the run time is very long, so I now suspect that vg is not actually making progress.

How can I check what vg is doing?

My vg command is below.

vg autoindex --workflow giraffe --prefix ./hg38_based_variation_graph/hg38_pangenome -r ./fasta/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta -v ./vcf/test.vcf.gz -T ./hg38_based_variation_graph/ --threads 2 --target-mem 35G --verbosity 1

4 months ago

There are steps in the indexing pipeline that do take a significant amount of time but not much memory, so this very well might be expected. It's hard for me to say anything for sure without seeing your log file though. Perhaps you could try again with --verbosity 2 and send the output?

Thanks for your comment! I tried the command with --verbosity 2. I saw the same messages, and the log files were still not updated. They were exactly the same!

The output is below.

[IndexRegistry]: Checking for phasing in VCF(s).

[IndexRegistry]: Provided: VCF w/ Phasing

[IndexRegistry]: Chunking inputs for parallelism.

[IndexRegistry]: Chunking FASTA(s).

[IndexRegistry]: Chunking VCF(s).

--> no change in 2 days

That is a step where we expect longer compute times with very little memory use though, which is consistent with what you are seeing. Are you using a very large VCF file? One thing that might help is to break it into separate VCFs by contig. Otherwise, there aren't a lot of good alternatives to just reading the entire file serially.
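
As a rough illustration of the per-contig split suggested above, here is a minimal sketch that uses only gzip and awk. The function name split_vcf_by_contig and the output layout (one uncompressed <contig>.vcf per contig, each with the full header) are assumptions for illustration and are not part of vg. It also still reads the whole file serially once, so it only helps if the per-contig files are then processed separately.

```shell
# Hypothetical helper (not part of vg): split a gzip/bgzip-compressed VCF
# into one uncompressed VCF per contig, each carrying the full header.
# Usage: split_vcf_by_contig ./vcf/test.vcf.gz ./vcf/by_contig
split_vcf_by_contig() {
    in_vcf="$1"
    out_dir="$2"
    mkdir -p "$out_dir"
    gzip -dc "$in_vcf" | awk -F '\t' -v dir="$out_dir" '
        # Collect every header line so it can be copied into each output file.
        /^#/ { header = header $0 "\n"; next }
        # When the contig changes, close the previous file and switch output.
        $1 != current {
            if (out != "") close(out)
            current = $1
            out = dir "/" current ".vcf"
            if (!(current in seen)) {
                # First record for this contig: start a fresh file with the header.
                printf "%s", header > out
                close(out)
                seen[current] = 1
            }
        }
        # Append the record to the file for its contig.
        { print >> out }
    '
}
```

Each resulting file would probably need to be compressed with bgzip and indexed with tabix before being given to vg. I believe vg autoindex accepts the -v option more than once, but please confirm that with vg autoindex --help for your version.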