I have a workflow I want to run for a large set of structures, but I don’t want the HPC admin to ban me for opening too many SSH connections. Can I set a maximum number of SSH connections that AiiDA can open at once?
What is the behavior of the Daemon w.r.t. SSH connections? If I submit multiple workflows using the same code on the same computer, does the daemon open one connection for every calcjob in the workflow that uses the remote computer?
Hi, AiiDA has a number of measures in place to prevent opening too many connections. Only one connection will ever be opened per computer per daemon worker, and that same connection is reused for all calcjobs running on that computer. Note that this is per daemon worker, so if you start your daemon with 4 workers, you can have up to 4 connections open to a particular computer at a time.
Besides that, each daemon worker will try to keep the connection open for a while rather than opening and closing it for every task. The minimum time between successive connection openings can also be configured through the --safe-interval option when you run verdi computer configure.
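For example, you can check the safe interval that is currently configured from Python. This is just a small sketch; 'my-cluster' is a placeholder for your computer label:
from aiida.orm import load_computer
# get_safe_open_interval() returns the minimum number of seconds the daemon
# waits between successive openings of this transport.
transport = load_computer('my-cluster').get_transport()
print(transport.get_safe_open_interval())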
Each daemon worker also bundles its requests to the scheduler: instead of requesting the status of each job individually, it polls once for all calcjobs running with the same scheduler, so as not to bombard it. The minimum interval between scheduler polls can also be configured:
from aiida.orm import load_computer
load_computer('my-cluster').set_minimum_job_poll_interval(30)  # in seconds; 'my-cluster' is your computer label
See the documentation as well.
One final caveat: the Python API allows you to open a connection (a “transport”) yourself. This operation is not guarded by the mechanisms mentioned above, so if you open transports manually in your workflow or calcjob code and submit enough of them, you can still violate the cluster’s access rules. Be careful when opening transports manually.
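For reference, this is roughly what opening a transport manually looks like (again, 'my-cluster' is a placeholder label). Each such call opens its own SSH connection, outside the daemon’s connection reuse and safe-interval throttling:
from aiida.orm import load_computer
# Opens a dedicated SSH connection, bypassing the per-worker connection
# pooling and throttling described above.
transport = load_computer('my-cluster').get_transport()
transport.open()
try:
    retcode, stdout, stderr = transport.exec_command_wait('ls')
finally:
    transport.close()  # always close manually opened transports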