Hi, I would like to use AiiDA for some complex workflows that involve a lot of tasks that can be written down as Python scripts. I would like to launch those scripts as tasks in AiiDA, but most of them live in specific conda/mamba environments. In addition, for some of them I want to reinstall the packages from a repo every time the pipeline is run, i.e. by creating a new venv and installing them from scratch.
Is this functionality already implemented somewhere, or do I need to implement this myself? Where would I start?
Thanks for your help!
Perhaps aiida-shell would satisfy your requirements. As a quick example, you can run a simple Python script as follows:
from aiida_shell import launch_shell_job
from aiida.orm import SinglefileData

results, node = launch_shell_job(
    'python',
    arguments='{script}',
    nodes={
        'script': SinglefileData('/abs/path/to/script.py')
    }
)
This will run /abs/path/to/script.py on the localhost. To customize the environment, for example to load a conda env, have a look at this how-to guide.
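As a rough illustration of what such a customization could look like, here is a minimal sketch that assumes the environment is activated through the `prepend_text` option that AiiDA calculation jobs accept under `metadata.options`. The helper names `conda_metadata` and `run_script_in_conda_env` are hypothetical, and the activation lines are an assumption that may need adjusting to how conda is installed on the target machine.

```python
def conda_metadata(env_name: str) -> dict:
    """Return a ``metadata`` dict whose ``prepend_text`` activates a conda env.

    The text is inserted into the job script before the command is executed.
    """
    prepend_text = '\n'.join([
        # Assumption: this makes ``conda activate`` usable in a non-interactive shell.
        'eval "$(conda shell.bash hook)"',
        f'conda activate {env_name}',
    ])
    return {'options': {'prepend_text': prepend_text}}


def run_script_in_conda_env(script_path: str, env_name: str):
    """Run the script at the absolute ``script_path`` inside the given conda env."""
    # Imported lazily so that ``conda_metadata`` can be used without an AiiDA profile.
    from aiida.orm import SinglefileData
    from aiida_shell import launch_shell_job

    return launch_shell_job(
        'python',
        arguments='{script}',
        nodes={'script': SinglefileData(script_path)},
        metadata=conda_metadata(env_name),
    )
```

Depending on the aiida-shell version, the command may be resolved to an absolute path before the job runs, in which case the interpreter of the activated environment would not necessarily be the one that is used. It is worth checking the documentation of your installed version for this.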
It is not fully clear from your message whether you want the creation of the env to actually be part of the workflow managed by AiiDA. I am not sure if that would be advisable, but since aiida-shell can also run a bash script through AiiDA, you could wrap all of that in a bash script and run that.
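To make the bash-wrapper idea concrete, here is a minimal sketch under a few assumptions: the wrapper creates a fresh venv in the job's working directory, installs a package with pip, and then runs the Python script that is passed as its first argument. The function names, the `fresh_env` directory and the package source are hypothetical placeholders, not something prescribed by aiida-shell.

```python
import shlex


def build_fresh_venv_script(package_source: str, venv_dir: str = 'fresh_env') -> str:
    """Return a bash script that builds a clean venv, installs a package and runs a script.

    The Python script to execute is passed to the bash script as its first argument.
    """
    venv = shlex.quote(venv_dir)
    lines = [
        '#!/bin/bash',
        'set -euo pipefail',  # Abort as soon as any command fails.
        f'python3 -m venv {venv}',
        # Call the venv's own executables directly instead of sourcing ``activate``.
        f'{venv}/bin/pip install {shlex.quote(package_source)}',
        f'{venv}/bin/python "$1"',
    ]
    return '\n'.join(lines) + '\n'


def launch_in_fresh_venv(wrapper_path: str, script_path: str):
    """Run the wrapper through aiida-shell; both paths must be absolute."""
    # Imported lazily so the script builder can be used without an AiiDA profile.
    from aiida.orm import SinglefileData
    from aiida_shell import launch_shell_job

    return launch_shell_job(
        'bash',
        arguments='{wrapper} {script}',
        nodes={
            'wrapper': SinglefileData(wrapper_path),
            'script': SinglefileData(script_path),
        },
    )
```

The text returned by `build_fresh_venv_script` would first be written to a file, whose absolute path is then passed as `wrapper_path`. Keeping the wrapper as a stored node has the side effect that the exact installation steps are recorded alongside the job.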
Thanks for the quick reply. That is helpful indeed. One follow-up: you said it might not be advisable to reinstall a python virtual environment as part of a workflow.
Is there a specific argument that you have in mind? For my use case, it would be important to start with a clean environment every time.
Mainly that it would be quite costly to redo it every time, but I guess if that is simply one of your requirements, then why not. I have never done it nor even considered it, so I cannot tell you of any caveats off the bat. I guess you will be exploring this use case for all of us, so please do let us know your findings.
I was thinking about your use case some more and I was wondering: why did you choose AiiDA? Is there a specific feature that you want? I am curious whether you considered other options, e.g. the Common Workflow Language. If your workflow consists of simply chaining Python scripts together and managing their environments, that might be more suitable. AiiDA seems perhaps a bit of a heavy alternative.
Well, this is going to become more complicated gradually. For example, we have workflows that run optimization loops on finite-element models (using HPC resources), i.e. dynamic tasks with an if(err<value) break condition in between.
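The kind of loop meant here can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `optimize` and `run_step` are hypothetical names, and `run_step` stands in for whatever launches one finite-element evaluation (for instance a function wrapping an aiida-shell job) and returns the updated parameter together with the error. In AiiDA itself, such a loop would more typically be expressed in a `WorkChain` outline using `while_`, so that each iteration is tracked in the provenance graph.

```python
from typing import Callable, Tuple


def optimize(
    run_step: Callable[[float], Tuple[float, float]],
    initial_parameter: float,
    tolerance: float,
    max_iterations: int = 50,
) -> Tuple[float, float, int]:
    """Repeat ``run_step`` until the error drops below ``tolerance``.

    Returns the final parameter, the final error and the number of iterations run.
    """
    parameter = initial_parameter
    error = float('inf')
    iterations = 0

    for iterations in range(1, max_iterations + 1):
        # One evaluation of the model, e.g. a job submitted to the HPC resource.
        parameter, error = run_step(parameter)
        if error < tolerance:
            break  # Converged: stop launching further tasks.

    return parameter, error, iterations
```

The `max_iterations` cap is a design choice to avoid an unbounded number of HPC submissions if the error never falls below the tolerance.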
Hi @sphuber, do you think PortableCode would fit this use case as well? Or can the PortableCode use cases all be covered by aiida-shell already?
I think this could also be used, yes, but it wouldn’t really add anything over aiida-shell. I think aiida-shell covers all of the functionality and is easier to use and more flexible, as it doesn’t require setting up an explicit Code instance.
That being said, I actually mentioned using the ContainerizedCodes in response to a private message. Given that reproducibility is such a strong requirement for their use case, using containers (e.g. using Docker) might be very interesting.