Pause a process when launching it

Hi all, is it possible to pause a process while launching it? I want to pause a process and check the result, then decide whether to continue or not. I know there is a pause_processes function to pause the process.

from aiida.engine import submit
from aiida.engine.processes.control import pause_processes

process = submit(xxx)
pause_processes([process])

But this could take some time, so the process may already have finished by then. I want a command like:

process = submit_and_pause(xxx)

so that the pause command can be guaranteed to be executed.

This functionality doesn’t exist yet, but it wouldn’t be difficult to implement. Essentially you would just have to override the aiida.engine.launch.submit method and add a line after this line: https://github.com/aiidateam/aiida-core/blob/ffa054e9bb16a8bdc355e06c454e2c97dc143888/aiida/engine/launch.py#L105

    process_inited = instantiate_process(runner, process, **inputs)
    # now you add the following line
    process_inited.pause()

That should pause the process immediately before it is stopped and the task is sent to RabbitMQ. When a daemon worker picks it up, it will reconstruct it from the checkpoint and it will already be in the paused state (if all works as intended).

This sounds like an interesting feature. Would it make sense to open a PR adding a new paused=True flag to submit? (Each new flag does “hide” part of the namespace of possible work chain inputs, though: I think you would no longer be able to use paused as the name of an input of a work chain you want to submit. So maybe it’s not the best fit for the current design, even if I think I’d prefer submit(xxx, paused=True) to a new function submit_and_pause.)

The shadowing might indeed be a problem. Actually, we recently added the wait and wait_interval arguments and didn’t really think about this. Luckily we didn’t release it yet, so we can still revert it. It is a real shame, because I think it is a great feature, but I am afraid that we will have to revert it since I don’t see a way around it and it could actually break processes that use either a wait or wait_interval keyword.

One potential solution would be to prefix those arguments with an underscore, i.e.:

def submit(process_class, _wait=False, _wait_interval=5, _pause=False, **inputs):
    ...

submit(cls, _wait=True, **{'wait': Bool(True)})

will work. It is a bit ugly, to be fair, but maybe that is still better than having to create another function and force users to import it? What do you think?
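To make the shadowing problem and the underscore workaround concrete, here is a minimal sketch in plain Python. The two functions are toy stand-ins, not the real aiida.engine.submit: they only return what the launcher would treat as its own options and what it would forward as process inputs. The default values and the submit_unprefixed name are assumptions made for illustration.

```python
def submit(process_class, _wait=False, _wait_interval=5, _pause=False, **inputs):
    """Toy stand-in for a launcher: return (launcher options, process inputs)."""
    options = {"wait": _wait, "wait_interval": _wait_interval, "pause": _pause}
    return options, inputs


def submit_unprefixed(process_class, wait=False, **inputs):
    """Hypothetical variant without the prefix: an input named ``wait`` is swallowed."""
    return {"wait": wait}, inputs


# With the prefix, the launcher option and the process input coexist.
options, inputs = submit(object, _wait=True, **{"wait": 42})

# Without the prefix, the value meant for the process never reaches ``inputs``.
shadow_options, shadow_inputs = submit_unprefixed(object, wait=42)
```

In the unprefixed variant the process silently loses its wait input, which is exactly the breakage discussed above.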

Thanks @sphuber @giovannipizzi for your replies.

I prefer to prefix those arguments with an underscore.

We would also need to add process_inited.pause() after this line in the runner.

Mmm I don’t know, I also don’t particularly like the underscore prefix, I would be OK only if we don’t find a better option.

I see two other possibilities:

  • define a submit_xxx function that takes all these parameters (wait, paused, …), plus any new ones in the future, but takes the inputs as a dictionary inputs={} rather than as kwargs. submit_xxx would get a sensible name, e.g. submit_advanced (which I don’t like) or something similar. Then we only have two variants, and in the future we keep extending the submit_xxx function. The existing submit keeps its current behaviour of taking the inputs as kwargs.
  • if we instead want people to always pass the inputs as a dict, and to use only submit and no additional functions, we would need to deprecate the old way. However, we need to see whether that is a better idea than having 2 functions (and then we also need to adapt other functions such as run etc). Not sure if this is the best approach, but just putting it here for discussion.

Good suggestions. I would really like to stay away from adding a new function. The use of the submit function, as well as its name, is so widespread that having to update it would be a large change.

I think we could find a deprecation pathway for passing the inputs as kwargs, and instead have them passed as a dictionary to the inputs keyword argument. Maybe we could change the signature to the following:

def submit(
    cls,
    inputs: dict[str, Data] | None = None,
    *,
    wait: bool = False,
    wait_interval: int = 5,
    **kwargs
):
    if inputs is None:
        warn.warn_deprecation("""\
            Passing inputs as keyword arguments is deprecated. Please pass as the second positional argument:
            
                inputs = {...}
                submit(cls, inputs)

            or as a keyword argument instead:

                inputs = {...}
                submit(cls, inputs=inputs)
            """
        )
        inputs = kwargs

i.e. we add the inputs parameter, which can be passed either positionally or as a keyword argument. The other arguments we make keyword-only by using the * marker. We can now detect when inputs is not specified, assign the kwargs to it and emit a deprecation warning. This should preserve backwards compatibility, and it gives an alternative for processes that define a wait or wait_interval input, which would otherwise clash with the new arguments. In the changelog and docs we can suggest that these users already switch to the new approach of using inputs instead of kwargs for the inputs.
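The fallback logic can be tried out in isolation with the following sketch. It is a toy function using only the standard library warnings module, not the aiida-core implementation. Two details are my own assumptions rather than part of the proposal: the warning is only emitted when keyword inputs are actually present, and mixing both styles raises an error.

```python
import warnings


def submit(cls, inputs=None, *, wait=False, wait_interval=5, **kwargs):
    """Toy launcher: prefer ``inputs``; fall back to keyword arguments with a warning."""
    if inputs is not None and kwargs:
        # Assumption: mixing both styles is treated as an error.
        raise ValueError("pass process inputs either as `inputs` or as keywords, not both")
    if inputs is None and kwargs:
        warnings.warn(
            "Passing inputs as keyword arguments is deprecated; use submit(cls, inputs).",
            DeprecationWarning,
            stacklevel=2,
        )
        inputs = kwargs
    return {"wait": wait, "wait_interval": wait_interval}, dict(inputs or {})


# New style: an input called ``wait`` no longer clashes with the ``wait`` option.
new_options, new_inputs = submit(object, {"wait": 1, "x": 2}, wait=True)

# Old style still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_options, old_inputs = submit(object, x=2, y=3)
```

Note that in the old style a keyword named wait is still consumed as the launcher option, which is why processes with such an input would need to move to the dictionary form.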

I am pretty sure this should work. Am I missing anything? Since we maintain backwards compatibility I think this is acceptable. For the time being, the deprecation warning wouldn’t even show up yet so nothing really changes for now. I think we probably want to support the old behavior for a long time because it is used so ubiquitously.

We can apply the same approach to the run function and of course the equivalents on the Runner class.

How would one pass a VAR_POSITIONAL argument (i.e. *args) through the inputs dictionary? By adding a special key? E.g.:

{"var_args": ()}

(Variable) positional arguments are currently not supported for process inputs either. All process inputs have to be keyword arguments, so I don’t think we would need to add anything here.

I think one can submit a calcfunction, which may have a variable positional argument (*args).

That is true. It used to be impossible to submit process functions. But as we refactored them to be actual Process subclasses, it actually became possible, so I removed the restriction from the submit function. However, there is a bit of an inconsistency, because as you can see the Runner.submit still contains the blocker:

Now I wonder how many people are actually aware of this and are submitting process functions. I doubt very many if any. I also wonder what the specific use case would be. Would it really be necessary for certain use cases to have them submitted to the daemon? If they are part of another process that is being run by the daemon, they are already run on the daemon runner. I guess it might be useful if you need to run a large number of calcfunctions in parallel from an interactive notebook or script and instead of running them sequentially, you can submit them and have the daemon workers do it.

I wonder if we really have to cater to this problem. And if users really need this, they could just write their process function to deal with keyword arguments only? I don’t really like the idea of reserving a specific key in the inputs dictionary to be honest.
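For reference, this is how a variable positional parameter can be detected with the standard library, and what the keyword-only alternative looks like. It is a plain Python sketch with ordinary functions, not calcfunctions; has_var_positional, add_all and add_pair are hypothetical names used only for illustration.

```python
import inspect


def has_var_positional(func):
    """Return True if ``func`` declares a ``*args`` parameter."""
    return any(
        parameter.kind is inspect.Parameter.VAR_POSITIONAL
        for parameter in inspect.signature(func).parameters.values()
    )


def add_all(*values):
    """Takes variable positional arguments, which do not map onto a dictionary of inputs."""
    return sum(values)


def add_pair(x, y):
    """Takes named arguments only, so it can be called from an ``inputs`` dictionary."""
    return x + y


# The keyword style fits naturally into an ``inputs`` dictionary.
result = add_pair(**{"x": 1, "y": 2})
```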

I made a test using

process_inited.pause()

It shows the state as Created instead of Paused:

1613  5s ago     add             ⏹ Created

Then I can play it using verdi process play 1613. This was not what I expected, but it is still fine for my use case, since I can pause the next node instead.

I think it is actually properly paused. The daemon didn’t start running it once you submitted it, did it? The “problem” is that the functionality that actually sets the paused state on the ProcessNode (which is what is displayed by verdi process list) only gets hit when the process is actually run by a runner and makes a state transition. But the submit method doesn’t actually start running the process, and when the daemon picks it up, the process is paused and so also doesn’t go through any state change in order to hit the code that proxies the paused state on the node. We could think of having the submit actually run through a first state transition, but would have to look into it to figure out how to do this.
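The mechanism described here can be illustrated with a small toy model. The Node and Process classes below are hypothetical and much simpler than the real AiiDA and plumpy classes; the only assumption they encode is the one stated above, namely that the paused flag is mirrored onto the node exclusively during a state transition.

```python
class Node:
    """Toy stand-in for the stored record that a listing command would read."""

    def __init__(self):
        self.state = "created"
        self.paused = False


class Process:
    """Toy process whose node is only updated inside a state transition."""

    def __init__(self):
        self.node = Node()
        self.paused = False

    def pause(self):
        # Only the in-memory flag changes; the node is not touched here.
        self.paused = True

    def play(self):
        self.paused = False

    def transition_to(self, state):
        # The single place where process state is mirrored onto the node.
        self.node.state = state
        self.node.paused = self.paused

    def step(self):
        # A paused process does nothing, so no transition happens.
        if self.paused:
            return False
        self.transition_to("running")
        return True


process = Process()
process.pause()
ran_while_paused = process.step()
state_seen_by_listing = (process.node.state, process.node.paused)

process.play()
ran_after_play = process.step()
```

In this model the process really is paused (it refuses to step), yet the node still reads as created and not paused until the first transition after playing it.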

What is the actual use case for this feature by the way @Xing ?

I have opened a PR with a first implementation of the solution to the current submit signature problem: “Engine: Make process inputs in launchers positional”, Pull Request #6202 in aiidateam/aiida-core.

My use case:
Within the worktree, the user has precise control over the workflow. Suppose a worktree has the following nodes:

node1 --> node2 --> node3 --> node4 --> ...

Before submitting the worktree (or before node2 has finished), one can pause node3 (even though its process has not been created yet). When node3 is ready to be submitted, it will be submitted and paused so that the user can check the result of node3 and then make decisions, e.g. change the inputs of node4, or even replace node4 with another node.
So it would be good if the process could run and be paused once it has finished.
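As a rough illustration of this use case, the following plain Python sketch runs a linear chain and stops after a node that was marked for pausing beforehand. It is not the worktree API: run_chain, pause_after and the lambda nodes are hypothetical, and pausing is modelled simply as returning control to the caller.

```python
def run_chain(nodes, pause_after, start=0):
    """Run (name, callable) pairs in order; stop after any node named in ``pause_after``.

    Returns the results so far and the index of the next node to run.
    """
    results = []
    for index in range(start, len(nodes)):
        name, func = nodes[index]
        results.append((name, func()))
        if name in pause_after:
            return results, index + 1
    return results, len(nodes)


nodes = [
    ("node1", lambda: 1),
    ("node2", lambda: 2),
    ("node3", lambda: 3),
    ("node4", lambda: 4),
]

# Ask for a pause after node3 before anything has run.
first_results, resume_at = run_chain(nodes, pause_after={"node3"})

# Inspect the result of node3, then swap node4 for something else and continue.
nodes[3] = ("node4", lambda: 40)
rest_results, _ = run_chain(nodes, pause_after=set(), start=resume_at)
```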