verdi computer test: "Creating and deleting temporary file" fails when work_dir is the scratch directory on HPC

Hi,
I am new to AiiDA as well as to HPC for high-throughput workflows, so apologies in advance if I miss something obvious.

The main issue is that verdi computer test passes every check except the creation and deletion of a temporary file when I use the scratch directory (a Lustre parallel file system on the HPC cluster) as work_dir when configuring the computer:

$ verdi computer show noctua2


Label                        noctua2
PK                           10
UUID                         8a16267b-5570-47f3-a10e-c84764f9d953
Description                  Noctua 2 HPC Cluster at PC2
Hostname                     fe.noctua2.pc2.uni-paderborn.de
Transport type               core.ssh
Scheduler type               core.slurm
Work directory               /scratch/hpc-prf-spectr/Abdullah/temp
Shebang                      #!/bin/bash
Mpirun command               srun -n {tot_num_mpiprocs}
Default #procs/machine       64
Default memory (kB)/machine  268435456
Prepend text                 export PATH=/opt/software/pc2/EB-SW/software/OpenMPI/4.1.6-GCC-13.2.0/bin:$PATH
                             export LD_LIBRARY_PATH=/opt/software/pc2/EB-SW/software/OpenMPI/4.1.6-GCC-13.2.0/lib:$LD_LIBRARY_PATH
Append text                  echo "Job execution complete."


$ verdi computer test noctua2 --print-traceback
Report: Testing computer for useraiida@localhost

  • Opening connection… [OK]

  • Checking for spurious output… [OK]

  • Getting number of jobs from scheduler… [OK]: 0 jobs found in the queue

  • Determining remote user name… [OK]: abshahid

  • Creating and deleting temporary file… [Failed]: OSError: Error during mkdir of '/scratch', maybe you don't have the permissions to do it, or the directory already exists? ([Errno 13] Permission denied)
    Full traceback:
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.10/site-packages/aiida/cmdline/commands/cmd_computer.py", line 124, in _computer_create_temp_file
        transport.chdir(workdir)
      File "/opt/conda/lib/python3.10/site-packages/aiida/transports/plugins/ssh.py", line 593, in chdir
        self.sftp.chdir(path)
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 659, in chdir
        if not stat.S_ISDIR(self.stat(path).st_mode):
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 493, in stat
        t, msg = self._request(CMD_STAT, path)
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 822, in _request
        return self._read_response(num)
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 874, in _read_response
        self._convert_status(msg)
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 903, in _convert_status
        raise IOError(errno.ENOENT, text)
    FileNotFoundError: [Errno 2] No such file

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.10/site-packages/aiida/transports/plugins/ssh.py", line 704, in mkdir
        self.sftp.mkdir(path)
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 460, in mkdir
        self._request(CMD_MKDIR, path, attr)
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 822, in _request
        return self._read_response(num)
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 874, in _read_response
        self._convert_status(msg)
      File "/opt/conda/lib/python3.10/site-packages/paramiko/sftp_client.py", line 905, in _convert_status
        raise IOError(errno.EACCES, text)
    PermissionError: [Errno 13] Permission denied

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.10/site-packages/aiida/cmdline/commands/cmd_computer.py", line 552, in computer_test
        success, message = test(
      File "/opt/conda/lib/python3.10/site-packages/aiida/cmdline/commands/cmd_computer.py", line 126, in _computer_create_temp_file
        transport.makedirs(workdir)
      File "/opt/conda/lib/python3.10/site-packages/aiida/transports/plugins/ssh.py", line 689, in makedirs
        self.mkdir(this_dir)
      File "/opt/conda/lib/python3.10/site-packages/aiida/transports/plugins/ssh.py", line 707, in mkdir
        raise OSError(
    OSError: Error during mkdir of '/scratch', maybe you don't have the permissions to do it, or the directory already exists? ([Errno 13] Permission denied)

  • Checking for possible delay from using login shell… [OK]
    Warning: 1 out of 6 tests failed

However, all 6 tests succeed when I use my home directory (small storage) as the work directory:

$ verdi computer show noctua2


Label                        noctua2
PK                           11
UUID                         5e018103-0bd1-457f-8f21-698dda8ca813
Description                  Noctua 2 HPC Cluster at PC2
Hostname                     fe.noctua2.pc2.uni-paderborn.de
Transport type               core.ssh
Scheduler type               core.slurm
Work directory               /pc2/users/a/abshahid/aiida_wor_dira
Shebang                      #!/bin/bash
Mpirun command               srun -n {tot_num_mpiprocs}
Default #procs/machine       64
Default memory (kB)/machine  268435456
Prepend text                 export PATH=/opt/software/pc2/EB-SW/software/OpenMPI/4.1.6-GCC-13.2.0/bin:$PATH
                             export LD_LIBRARY_PATH=/opt/software/pc2/EB-SW/software/OpenMPI/4.1.6-GCC-13.2.0/lib:$LD_LIBRARY_PATH
Append text                  echo "Job execution complete."


$ verdi computer test noctua2
Report: Testing computer for useraiida@localhost

  • Opening connection… [OK]
  • Checking for spurious output… [OK]
  • Getting number of jobs from scheduler… [OK]: 0 jobs found in the queue
  • Determining remote user name… [OK]: abshahid
  • Creating and deleting temporary file… [OK]
  • Checking for possible delay from using login shell… [OK]
    Success: all 6 tests succeeded

I have tried to investigate, and this is what I understand so far. AiiDA relies on SFTP operations over SSH (using Paramiko) to manage remote work directories during its computer tests. It first attempts to change to the designated work directory using sftp.chdir(), and if that fails (typically because the directory doesn’t exist or isn’t accessible), it tries to create it using transport.makedirs(). In the traceback above, the chdir fails with “No such file” and the subsequent makedirs already fails with “Permission denied” at the top-level /scratch. HPC systems like Noctua 2 could have SFTP chrooted to my home directory. If that is the case, any operation on a directory outside the home (e.g., the scratch space) would fail. I have also tried using a symlink, but it did not work, maybe due to permission issues with symbolic links.
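To make the sequence above concrete, here is a minimal sketch of the "chdir, else makedirs" fallback, run against a simulated host. It is not AiiDA's or Paramiko's code: FakeSftpHost and ensure_workdir are hypothetical names, and the simulated host assumes that /scratch is simply not present on the frontend and that only the home directory is writable. Under those assumptions the failure lands on the first path component, /scratch, just as in the traceback.

```python
import errno
import posixpath


class FakeSftpHost:
    """Hypothetical stand-in for a remote host (not a real AiiDA transport).

    `existing` is the set of directories that exist; `writable` is the set of
    directories in which the user is allowed to create new entries.
    """

    def __init__(self, existing, writable):
        self.existing = set(existing)
        self.writable = set(writable)

    def chdir(self, path):
        # Mimics the ENOENT error seen when the work directory is missing.
        if path not in self.existing:
            raise FileNotFoundError(errno.ENOENT, "No such file", path)

    def mkdir(self, path):
        if path in self.existing:
            raise FileExistsError(errno.EEXIST, "File exists", path)
        # Creating a directory requires write access to its parent.
        if posixpath.dirname(path) not in self.writable:
            raise PermissionError(errno.EACCES, "Permission denied", path)
        self.existing.add(path)
        self.writable.add(path)

    def makedirs(self, path):
        # Create every missing component, starting from the top of the path.
        current = "/"
        for part in path.strip("/").split("/"):
            current = posixpath.join(current, part)
            if current not in self.existing:
                self.mkdir(current)


def ensure_workdir(host, workdir):
    """Try to enter the work directory; if that fails, try to create it."""
    try:
        host.chdir(workdir)
        return "exists"
    except OSError:
        host.makedirs(workdir)
        host.chdir(workdir)
        return "created"


if __name__ == "__main__":
    home = "/pc2/users/a/abshahid"
    frontend = FakeSftpHost(
        existing=["/", "/pc2", "/pc2/users", "/pc2/users/a", home],
        writable=[home],
    )
    print(ensure_workdir(frontend, home + "/aiida_work"))
    try:
        ensure_workdir(frontend, "/scratch/hpc-prf-spectr/Abdullah/temp")
    except PermissionError as exc:
        print("mkdir failed at:", exc.filename)
```

The point of the sketch is that a "Permission denied" on /scratch itself is what one would expect whenever /scratch is not visible on the host being contacted, with or without an SFTP chroot.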

So my question is: what do you think about the issue? I have looked through the discussions here but could not find any relevant posts. Does that mean it is not a typical issue? Is this specific to my HPC system, or am I thinking about the whole thing wrong and making a very obvious mistake? I would highly appreciate any solution or advice about my problem. Thank you. :blush:
noctua2_computer.txt (649 Bytes)
noctua2_ssh_config.txt (400 Bytes)

Hi, assuming that /scratch actually exists, the issue is indeed that AiiDA is trying to create /scratch, which is clearly not possible.

If you just ssh into the computer fe.noctua2.pc2.uni-paderborn.de and run ls /scratch, does it work or does it give an error?
If it works, you should ask your supercomputer center. If it gives an error, /scratch does not exist, at least on the system you are trying to SSH to.

From a quick Google search, the instructions for your center are these, and they say to copy files with e.g. scp as

scp -o 'ProxyJump <your-username>@fe.noctua2.pc2.uni-paderborn.de' <your-files> <your-username>@n2login5:/scratch/<path>

So you do not actually want to stop at fe.noctua2.pc2.uni-paderborn.de, but proxy through it and connect to one of the login nodes, e.g. n2login5. They also say:

You can use other cluster frontends (n2login1 , n2login2 , …) as the target.

So (assuming you can also see your home from the login nodes, and can submit and monitor jobs there) I suggest you create a new computer whose hostname is, for instance, n2login5, and use AiiDA’s ProxyJump support to access it via fe.noctua2.pc2.uni-paderborn.de. Hope it helps! (Otherwise I suggest also asking the supercomputer center for support.)
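Before reconfiguring anything in AiiDA, the two routes can be compared by hand. The following is only a sketch: build_check_command and path_is_visible are hypothetical helper names, and it assumes an OpenSSH client (which provides the -J jump-host option) and key-based login on both hosts. It builds one command that runs ls -ld /scratch on the frontend directly and one that reaches n2login5 through the frontend.

```python
import shlex
import subprocess


def build_check_command(user, target, path, jump_host=None):
    """Build an ssh command that runs `ls -ld PATH` on the target host.

    If jump_host is given, the connection is proxied through it using
    OpenSSH's -J option.
    """
    cmd = ["ssh"]
    if jump_host is not None:
        cmd += ["-J", f"{user}@{jump_host}"]
    # Quote the path because ssh hands the remote command to a shell.
    cmd += [f"{user}@{target}", "ls", "-ld", shlex.quote(path)]
    return cmd


def path_is_visible(cmd, timeout=30):
    """Run the command and report whether the remote `ls` succeeded."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return result.returncode == 0


if __name__ == "__main__":
    frontend = "fe.noctua2.pc2.uni-paderborn.de"
    user = "abshahid"  # remote user name reported by verdi computer test
    direct = build_check_command(user, frontend, "/scratch")
    proxied = build_check_command(user, "n2login5", "/scratch", jump_host=frontend)
    for label, cmd in (("frontend", direct), ("n2login5 via frontend", proxied)):
        print(label, "->", "visible" if path_is_visible(cmd) else "not visible")
```

If /scratch is visible only through the proxied route, that would support configuring the new computer with n2login5 as the hostname and the frontend as the jump host.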

Giovanni