Hello,
The problem is consistent: the output for step 6.1 is always missing the .coor and .vel files. The wall time runs out before step 6.1 has written all the files I need to move on to step 6.2. It seems as if the process is stuck at the step 6.1 equilibration.
Has anyone had a similar problem with the HMMM Builder from CHARMM-GUI?
I have used the helios3 server and now the beluga server, yet my equilibration takes an extremely long time. I used GPUs on one server and CPUs on the other, yet the same issue persists. I also tried two similar scripts, with the same problem.
When I look at the TIMING lines in the log file, the run seems to be fast at first, then the estimated time remaining suddenly increases to 16 or 20 hours and stays high even when the job's wall time is about to run out.
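To pin down the step at which the slowdown starts, the TIMING lines can be pulled out of the log and compared. The following is only a minimal sketch: it assumes the usual NAMD 2.x TIMING layout shown in the comment (that sample line is illustrative, not copied from my log), and the function names and the slowdown factor of 5 are arbitrary choices of mine.

```python
import re

# Assumed layout of a NAMD 2.x TIMING line (illustrative, not from the real log):
# TIMING: 1000  CPU: 25.2, 0.0244/step  Wall: 25.3, 0.0245/step, 6.78 hours remaining, 1016.1 MB of memory in use.
NUM = r"([\d.eE+-]+)"
TIMING_RE = re.compile(
    r"^TIMING:\s+(\d+)\s+CPU:\s*" + NUM + r",\s*" + NUM + r"/step\s+"
    r"Wall:\s*" + NUM + r",\s*" + NUM + r"/step,\s*" + NUM + r" hours remaining"
)

def parse_timing(lines):
    """Return (step, wall_seconds_per_step, hours_remaining) for each TIMING line."""
    records = []
    for line in lines:
        match = TIMING_RE.match(line)
        if match:
            step = int(match.group(1))
            wall_per_step = float(match.group(5))
            hours_remaining = float(match.group(6))
            records.append((step, wall_per_step, hours_remaining))
    return records

def find_slowdowns(records, factor=5.0):
    """Return records whose wall time per step is at least `factor` times
    that of the previous TIMING record (hypothetical threshold)."""
    slow = []
    for (_, prev_wall, _), (step, wall, hours) in zip(records, records[1:]):
        if prev_wall > 0 and wall >= factor * prev_wall:
            slow.append((step, prev_wall, wall, hours))
    return slow
```

For example, calling `find_slowdowns(parse_timing(open(path)))` with `path` set to the step 6.1 log file would list the steps where the reported wall time per step jumps, which could then be compared with what is printed in the log around those steps.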
Here are the URLs to my folder, which has the namd directory inside it.
Zipped: sftp://asma97@beluga.computecanada.ca/lustre04/scratch/asma97/charmm-gui-8404446693-noCNBD2.zip
Unzipped: sftp://asma97@beluga.computecanada.ca/lustre04/scratch/asma97/charmm-gui-8404446693-noCNBD2
Here are the first few lines of the log file (I have no .out file):
Warning> Randomization of virtual memory (ASLR) is turned on in the kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try running with '+isomalloc_sync'.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (40-way SMP).
Charm++> cpu topology info is gathered in 0.017 seconds.
Info: NAMD 2.13 for Linux-x86_64-multicore
..
Info: Based on Charm++/Converse 60800 for multicore-linux64-icc
Info: Built Sat Mar 9 17:52:35 UTC 2019 by ebuser on build-node.computecanada.ca
Info: 1 NAMD 2.13 Linux-x86_64-multicore 40 blg9118.int.ets1.calculquebec.ca asma97
Info: Running on 40 processors, 1 nodes, 1 physical nodes.
Info: CPU topology information available.
Thank you.
Best,
Asma Feriel