Hi all,
I'm running AiiDA against a remote HPC, but the jobs aren't being submitted to the SLURM queue by sbatch (they never show up in squeue). I suspect this is an issue on the remote computer's side, but I want to check whether there is anything to be done on my end.
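For example, nothing from these submissions ever shows up when I check the queue on the cluster (the username here is just a placeholder):
squeue -u <username>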
verdi computer test
reports no issues, but submitting a Gaussian job to the HPC fails with the following (truncated verdi process report output):
+-> ERROR at 2024-06-11 11:39:44.169983-04:00
| Traceback (most recent call last):
| File "/Users/chemlab/anaconda3/envs/aiida/lib/python3.12/site-packages/aiida/engine/utils.py", line 202, in exponential_backoff_retry
| result = await coro()
| ^^^^^^^^^^^^
| File "/Users/chemlab/anaconda3/envs/aiida/lib/python3.12/site-packages/aiida/engine/processes/calcjobs/tasks.py", line 145, in do_submit
| return execmanager.submit_calculation(node, transport)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/Users/chemlab/anaconda3/envs/aiida/lib/python3.12/site-packages/aiida/engine/daemon/execmanager.py", line 375, in submit_calculation
| result = scheduler.submit_from_script(workdir, submit_script_filename)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/Users/chemlab/anaconda3/envs/aiida/lib/python3.12/site-packages/aiida/schedulers/scheduler.py", line 409, in submit_from_script
| return self._parse_submit_output(*result)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/Users/chemlab/anaconda3/envs/aiida/lib/python3.12/site-packages/aiida/schedulers/plugins/slurm.py", line 488, in _parse_submit_output
| raise SchedulerError(
| aiida.schedulers.scheduler.SchedulerError: Error during submission, could not retrieve the jobID from sbatch output; see log for more info.
+-> WARNING at 2024-06-11 11:39:44.188223-04:00
| maximum attempts 5 of calling do_submit, exceeded
I've also copied the inputs and the generated .sh file to another directory on the cluster and submitted the script manually with sbatch, and that works without issue.
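Concretely, the manual test was along these lines (both directories are placeholders; _aiidasubmit.sh is the default name AiiDA gives the generated submit script):
cp /path/to/aiida/workdir/* ~/manual_test/   # copy inputs plus the generated submit script
cd ~/manual_test
sbatch _aiidasubmit.sh   # accepted fine, job shows up in squeue and runs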
Anyway, I think this is a problem on the HPC side, but I'd like to double-check whether there is anything I can try from my end.
Thanks