Aiida-workgraph: Using while task to loop DFT calculations

Hi @Xing ,

Here I am again with questions about using aiida-workgraph. I am trying to build the following workflow,

  1. Run a DFT calculation that optimizes the cell (keeping atoms fixed)
  2. Run a DFT calculation that optimizes atomic positions (keeping cell fixed), use as input structure the output structure of (1)
  3. Check output forces and stress of (2), if not below threshold start over at (1), passing the output structure of (2) as its input structure

I hope I am explaining this clearly enough. I have been trying to follow the example here (Use while loop - AiiDA WorkGraph documentation) for using the while task and the example here (Use Context to pass data between tasks - AiiDA WorkGraph documentation) for passing data between tasks. So far I have come up with the following,

wg = WorkGraph("cell_atoms_relax")

# Initialize context
wg.context = {
    'struct': node1.inputs.structure,
    'folder': node1.outputs.remote_folder,
    'forces_stress': node1.outputs.forces_and_stress,
}

# Create while task
struct_converged = wg.add_task(structure_converged_siesta, name="converged", forces_and_stress='{{forces_stress}}', forces_threshold=orm.Float(0.04), stress_threshold=orm.Float(1 / cs.RyAng3Tokbar))
while1 = wg.add_task("While", max_iterations=100, conditions=struct_converged.outputs["result"])

# Create tasks within while loop
struct_ctx1 = wg.add_task("workgraph.from_context", name="struct_ctx1", key="struct")
folder_ctx1 = wg.add_task("workgraph.from_context", name="folder_ctx1", key="folder")

relax_cell = wg.add_task(SiestaBaseWorkChain, name="relax_cell")
relax_cell.set({
    'pseudos': node1.inputs.pseudos,
    'basis': node1.inputs.basis,
    'clean_workdir': node1.inputs.clean_workdir,
    'code': node1.inputs.code,
    'kpoints': node1.inputs.kpoints,
    'max_iterations': orm.Int(5),
    'options': node2.inputs.options,
    'parameters': orm.Dict(dict=pdict_cell.get_dict()),
    'structure': struct_ctx1.outputs["result"],
    'parent_calc_folder': folder_ctx1.outputs["result"],
})
relax_cell.set_context({
    'output_structure': 'struct',
    'remote_folder': 'folder',
})

struct_ctx2 = wg.add_task("workgraph.from_context", name="struct_ctx2", key="struct")
folder_ctx2 = wg.add_task("workgraph.from_context", name="folder_ctx2", key="folder")
struct_ctx2.waiting_on.add("relax_cell")
folder_ctx2.waiting_on.add("relax_cell")

relax_atoms = wg.add_task(SiestaBaseWorkChain, name="relax_atoms")
relax_atoms.set({
    'lua.md_run': node3.inputs.lua.md_run,
    'lua.script': node3.inputs.lua.script,
    'pseudos': node1.inputs.pseudos,
    'basis': node1.inputs.basis,
    'clean_workdir': node1.inputs.clean_workdir,
    'code': node1.inputs.code,
    'kpoints': node1.inputs.kpoints,
    'max_iterations': orm.Int(5),
    'options': node2.inputs.options,
    'parameters': orm.Dict(dict=pdict_atoms.get_dict()),
    'structure': struct_ctx2.outputs["result"],
    'parent_calc_folder': folder_ctx2.outputs["result"],
})
relax_atoms.set_context({
    'output_structure': 'struct',
    'remote_folder': 'folder',
    'forces_and_stress': 'forces_stress',
})

while1.children.add([
    "struct_ctx1",
    "folder_ctx1",
    "relax_cell",
    "struct_ctx2",
    "folder_ctx2",
    "relax_atoms",
])

Here node1, node2, node3 are SiestaBaseWorkChain nodes from earlier calculations, and pdict_cell and pdict_relax are dictionaries with input parameters I prepared beforehand. Also the comparison function is defined like this,

from aiida_workgraph import task
import numpy as np


@task.calcfunction()
def structure_converged_siesta(forces_and_stress, forces_threshold, stress_threshold):
    # Check if forces are below threshold
    forces_conv = np.all(np.abs(forces_and_stress.get_array('forces')) < forces_threshold.value)

    # Check if stress is below threshold
    stress_conv = np.all(np.abs(forces_and_stress.get_array('stress')) < stress_threshold.value)

    return forces_conv and stress_conv

However, if I try to submit this workgraph I get a very obscure error message:

aiida_workgraph/utils/analysis.py:91, in WorkGraphSaver.build_task_link(self)
     89         output["links"] = []
     90 for link in self.wgdata["links"]:
---> 91     to_socket = [
     92         socket
     93         for name, socket in self.wgdata["tasks"][link["to_node"]][
     94             "inputs"
     95         ].items()
     96         if name == link["to_socket"]
     97     ][0]
     98     from_socket = [
     99         socket
    100         for name, socket in self.wgdata["tasks"][link["from_node"]][
   (...)
    103         if name == link["from_socket"]
    104     ][0]
    105     to_socket["links"].append(link)

IndexError: list index out of range

This sounds like an error somewhere in the internals of aiida-workgraph and doesn’t really help me to figure out what is going wrong. Do you have any ideas on how to debug this? Or can you already spot an error in how I am setting up the workgraph? Or do you have other suggestions for creating a workflow like this with aiida-workgraph that is easier and/or less error prone? Any help/insight is appreciated.

1 Like

Also, using the {{name}} syntax from this tutorial (Use Context to pass data between tasks - AiiDA WorkGraph documentation) did not work for setting the input ports of the DFT calculation (I got an input port validation error that complained that the input didn’t have the required StructureData type but was of type str). So I had to use workgraph.form_context task to not get this error. Not sure if that is to be expected or a bug?

Hi @ahkole , thanks for providing a clear description of the issues.

Indeed, there was a bug affecting the {{name}} syntax with StructureData. I’ve just released version 0.4.3, which addresses this issue. Now, it should work as expected. Additionally, the IndexError: list index out of range error has been fixed in this update.

We’re aware that some error messages in WorkGraph aren’t as clear as they should be, and we plan to refactor error messages soon.

As for your workflow setup, I think it is on the right track with the While task and context usage. Please give the {{name}} syntax another try with the updated version. If you have more questions, please reach out.

1 Like

Hi @Xing , thanks for providing a fix so quickly! I’ll update my aiida-workgraph and will try to submit the workgraph again tomorrow. I’ll keep you updated.

Hi @Xing ,

I tried the new version and it progresses further but I am still running into issues. This time I get an error when it tries to launch the first DFT calculation. It excepts with the following error:

ValueError: Error occurred validating port 'inputs': mismatch between defined pseudos/ions and the list of kinds of the structure
 pseudos/ions:  
 kinds(including ghosts): Nb, Se

This seems to suggest that I am not specifying anything for the pseudos/ions but as you can see in the workgraph I listed above I do explicitly set the pseudos input port of both DFT calculations. I also verified that the kinds are correct in the input I am using.

I read in the docs (Port (Socket) - AiiDA WorkGraph documentation) that data validation is still experimental (although this is about calcfunctions). I don’t know if this might be related? Or maybe it’s related to getting the structure from the context? Maybe then the validation happens before the pseudos is properly set or something? Any idead what might be going wrong?

Maybe it helps, but I added a print statement in aiida_workgraph/engine/workgraph.py", line 1115, in run_tasks, just before running process = self.submit(executor, **kwargs) to check the contents of kwargs that the process is being submitted with. The contents are,

metadata: {'call_link_label': 'relax_cell'}
max_iterations: uuid: 5ded8903-09c5-437e-999c-dbf6b1bc8972 (pk: 19261) value: 5
clean_workdir: uuid: 50fb8ed3-cc5e-4cc0-843e-687704ad3e44 (pk: 91) value: False
monitors: {}
kpoints: uuid: d648e9a9-ea2b-4901-b0dc-de0b4c69411d (pk: 18557)
basis: uuid: 3fdce681-a132-4654-b191-2c4ac9b894cf (pk: 18490)
parent_calc_folder: uuid: f2fb8497-abf6-429e-b0b6-4355ae92a201 (pk: 18762)
pseudos: {'Nb': <PsmlData: uuid: b53fd3af-15d3-4526-b370-c16233599d06 (pk: 18534)>, 'Se': <PsmlData: uuid: bed001bb-f424-4530-ad0e-2d0b105c815b (pk: 18535)>}
ions: {}
lua: {}
code: Remote code 'siesta-e31cf1df2433297da8e15139a1926fd249cdac5a-xml-v1' on Snellius pk: 18522, uuid: 64bf28d5-147f-4a91-920b-f1ff63d8df32
structure: uuid: 9a7ba7ce-180a-43f3-a69a-a7ce81219c98 (pk: 18564)
parameters: uuid: eea45ad8-93e0-4476-912a-88158fbbe4c1 (pk: 19262)
options: uuid: c4aac870-aaaf-4e55-bce8-2ce39ca6d2f9 (pk: 18974)

To me this looks fine. Both structure and pseudos seem to be set to what I expect. So don’t know what is going wrong. I don’t know if the executor that is used here is wrong? Not sure what the executor is.

Hi @ahkole , good to know the previous fix works.

The validation error message is from this line in the aiida_siesta_plugin.

Apart the error, is there any other warning message? Could you share the verdi process report pk? Or, could you print the value and quantity after this line.

From the kwargs output you printed out:

pseudos: {'Nb': <PsmlData: uuid: b53fd3af-15d3-4526-b370-c16233599d06 (pk: 18534)>, 'Se': <PsmlData: uuid: bed001bb-f424-4530-ad0e-2d0b105c815b (pk: 18535)>}
ions: {}

I probably guess the reason. In SiestaCalculation, the ions input is a namespace, and in WorkGraph, namespaces are set with a default empty dict ({}).

When aiida_siesta_plugin validates the inputs, it treats ions as an input even if it’s an empty dict ({}). This causes the quantity to use ions and ignore the pseudos dict, which is the one you intend to use.

To resolve this, you can explicitly set ions to None, which will prevent it from being processed as an input inside WorkGraph.

relax_cell.set({
    'pseudos': node1.inputs.pseudos,
    'ions': None,  # Set to None explicity
    # other parameters...
})

You were right! Setting the ions input explicitly to None indeed fixed the problem. Thanks for coming up with a solution so quickly! The first DFT calculations is now actually running :slight_smile: Let’s hope the rest goes smoothly.

Maybe I can change the input validation in SiestaCalculation to actually check that the ions input namespace is not empty and default back to the pseudos if it is, to make it more robust to these kind of inputs.

Is there a reason btw for setting namespaces by default to an empty dict? Are plugins supposed to be able to handle that properly?

I don’t recall the exact reason to setting the default as {} :joy: . Maybe to mimic the builder, e.g.,

In [1]: code = load_code("qe-7.2-pw@localhost")

In [2]: code.get_builder()
Out[2]: 
Process class: PwCalculation
Inputs:
code: qe-7.2-pw@localhost
metadata:
  options:
    stash: {}
monitors: {}
pseudos: {}

But in principle, setting the default to None would be safer, as in your case with the SiestaCalculation. I will think about and test it, and consider changing it to None.

This problem sounds very familiar and it is something I had to deal with a long time ago. The problem comes from how plumpy.PortNamespaces work. It automatically assigns an empty dict to a port namespace even if no explicit port values are assigned. Since the ProcessBuilder is simply an extension of a PortNamespace it also has this behavior where if you instantiate one, all portnamespaces always have an empty dict, even if not explicitly set.

This would result into problems for workchains whose logic depended on a port namespace being defined or not. An example was the PwBandsWorkChain where the relax part was optional, and would be skipped as long as the relax subnamespace was not defined. But it would always be defined with at least an empty dict, in the case of the ProcessBuilder.

I “solved” this by adding an argument prune, to the private builder method ProcessBuilder._inputs. When set to True, it would remove any (nested) dictionaries that are completely empty. You can see how it is called here in instatiate_process:

Not sure how aiida-workgraph is launching processes, but maybe it is not using this?

Hi @sphuber , thanks very much for the information. WorkGraph does not prune the empty dictionary. But I think it would be good to use the same logic as the instatiate_process to make them more consistent.

Do you remember if pruning the empty dict has any downside? And should the None value also be pruned?

Since it is a side-effect of the ProcessBuilder I think it is justified to prune completely empty dicts in that case. Not sure when the inputs is just a dictionary already, because there the user may actually have intended an empty dictionary for some key. Anyway, the prune logic has been in aiida-core for a long time now and there have never been any problems as far as I am aware. And I think it would have prevented the issue described in this post (not 100% sure, but would be interesting to try). So I think it makes sense to use it in aiida-workgraph

Hi @ahkole , I hope you’re making progress with implementing the WorkGraph for your workflow. Since this is an interesting use case for the WorkGraph, so for reference and further discussion, I’ve provided an example version below.

WorkGraph

from aiida_workgraph import WorkGraph, task
from aiida_siesta.workflows.base import SiestaBaseWorkChain
from aiida import orm

@task.calcfunction()
def structure_not_converged(forces_and_stress=None, forces_threshold=4e-2, stress_threshold=1e-3):
    """
    Check if the structure's forces and stress have converged below specified thresholds.
    """
    import numpy as np
    if forces_and_stress is not None:
        forces_conv = np.all(np.abs(forces_and_stress.get_array('forces')) < forces_threshold.value)
        stress_conv = np.all(np.abs(forces_and_stress.get_array('stress')) < stress_threshold.value)
        return orm.Bool(not (forces_conv and stress_conv))  # Returns True if not converged
    else:
        return orm.Bool(True)  # If no forces/stress data, assume not converged

@task.graph_builder()
def relax_cell_atoms_workgraph(structure = None):
    """
    WorkGraph to iteratively relax both the cell and atomic positions of a structure.
    
    This workflow performs two alternating relaxation steps:
    - Cell relaxation, keeping atomic positions fixed.
    - Atomic position relaxation, keeping the cell fixed.
    If forces and stress are above defined thresholds, the workflow repeats the relaxation process.

    :param structure: Initial structure to be relaxed.
    :return: A WorkGraph object defining the iterative relaxation process.
    """
    wg = WorkGraph("cell_atoms_relax")
    
    # Initialize context to store intermediate results and structure states
    wg.context = {
        'current_structure': structure,
        'forces_and_stress': None,
        'remote_folder': None,
    }
    
    # Define a task to check if the structure has converged based on forces and stress
    not_converged = wg.add_task(structure_not_converged, name="structure_not_converged",
                                forces_and_stress='{{forces_and_stress}}')
    
    # Define a While loop that repeats relaxation tasks until convergence or max iterations
    while1 = wg.add_task("While", name="while", max_iterations=5,
                         conditions=not_converged.outputs["result"])
    
    # Task to relax the cell (keeping atomic positions fixed)
    relax_cell = wg.add_task(SiestaBaseWorkChain, name="relax_cell",
                             structure="{{current_structure}}",
                             parent_calc_folder="{{remote_folder}}")
    
    # Task to relax atomic positions (keeping cell fixed), using the structure output of relax_cell
    relax_atoms = wg.add_task(SiestaBaseWorkChain, name="relax_atoms",
                              structure=relax_cell.outputs["output_structure"],
                              parent_calc_folder=relax_cell.outputs["remote_folder"])
    
    # Update context with the outputs from the relax_atoms task for the next iteration
    relax_atoms.set_context({
        'output_structure': 'current_structure',
        'remote_folder': 'remote_folder',
        'forces_and_stress': 'forces_and_stress',
    })
    
    # Add relax_cell and relax_atoms tasks as children of the While loop to form the iteration sequence
    while1.children.add(["relax_cell", "relax_atoms"])
    
    return wg

Visualization

To better understand the logic of the WorkGraph, the user (also developer) can visualize it in a Jupyter notebook using the following code:

wg = relax_cell_atoms_workgraph()
wg

Example usage

For testing purposes, here is an example input setup. (Note: This is only an illustration; specific input parameters may need adjustment.)

from ase.build import bulk
from aiida import orm, load_profile
from aiida.orm import KpointsData


load_profile()

si = bulk('Si')
si.cell = si.cell * [0.9, 1.0, 1.1]
si.rattle(0.1)

structure = orm.StructureData(ase=si)
code = orm.load_code("siesta-5.2.0@localhost")

parameters_relax_atoms = orm.Dict(dict={
  "mesh-cutoff": "200 Ry",
  "dm-tolerance": "0.0001",
  "md-max-force-tol": "0.04 eV/ang",
  "md-max-stress-tol": "0.1 GPa",
  "md-num-cg-steps": "100",
  "md-type-of-run": "cg",
})

parameters_relax_cell = orm.Dict(dict={
  "mesh-cutoff": "200 Ry",
  "dm-tolerance": "0.0001",
  "md-max-force-tol": "0.04 eV/ang",
  "md-max-stress-tol": "0.1 GPa",
  "md-num-cg-steps": "100",
  "md-type-of-run": "cg",
  "md-variable-cell": True,
  "%block GeometryConstraints":
    """
    position from 1 to 2
    %endblock GeometryConstraints""",
})

options = {
    "resources": {
        "num_machines": 1,
        "num_mpiprocs_per_machine": 1,
    },
    "max_wallclock_seconds": 1800,
    # "withmpi": True,
}


pseudo_family = orm.load_group(label="PseudoDojo/0.4/PBE/SR/standard/psml")
pseudos = pseudo_family.get_pseudos(structure=structure)

kpoints=KpointsData()
kpoints.set_kpoints_mesh([5,5,5])

wg = relax_cell_atoms_workgraph(structure)
wg.tasks["relax_cell"].set({"pseudos": pseudos,
                            "code": code,
                            "kpoints": kpoints,
                            "parameters": parameters_relax_cell,
                            "options": options})
wg.tasks["relax_atoms"].set({"pseudos": pseudos,
                            "code": code,
                            "kpoints": kpoints,
                            "parameters": parameters_relax_atoms,
                            "options": options})
wg.run()
# wg.submit()

Let me know if you have any questions!

Hi @Xing ,

Yes, I have been able to get the workflow working as I intended. Thanks again for spotting and fixing the issues so quickly!

Thanks also for giving the example. Good to see how one can do it with the graph_builder.

One small question about the visualization. Since the context is used for passing the forces_and_stresses to the structure_not_converged and for passing the output structure of relax_atoms back into relax_cell if necessary, you don’t see these links in the visualized graph. Are there any plans of visualizing data passed through the context as well? Or would that be impossible (too difficult)?

And maybe adding to this: is data provenance for data passed through the context stored in the AiiDA provenance graph?

@Xing . Actually, upon closer inspection it appears that the workflow didn’t fully work as I intended. In the inputs for the relax_atoms task I also specified two input ports in the lua namespace (lua.md_run and lua.script). However, when inspecting the called SiestaBaseWorkChain the lua namespace actually does not contain the values that I set here. Does the . syntax not work for setting input ports inside a namespace when using the set function from aiida-workgraph? If not, how are you supposed to properly set those inputs?

Hi @ahkole , yes, we don’t support the visualization of data from context for the moment because it will create a cyclic graph in some cases (e.g., in a while loop), which may make it difficult to know where the data comes from.

is data provenance for data passed through the context stored in the AiiDA provenance graph?

Yes.

Does the . syntax not work for setting input ports inside a namespace when using the set function from aiida-workgraph ?

The set function can handle the port with .. I did a quick test using the example workgraph,

wg.tasks["relax_cell"].set({"pseudos": pseudos,
                            "code": code,
                            'lua.md_run': orm.Bool(True),
                            "kpoints": kpoints,
                            "parameters": parameters_relax_cell,
                            "options": options})

and here are the inputs of the process. you can see the input value for lua.md_run is used by the WorkChain.

$ verdi process show 633
Property     Value
-----------  ------------------------------------
type         SiestaBaseWorkChain
state        Waiting
pk           633
uuid         12c4c125-5712-4e9f-adea-3c4acf5af52d
label        relax_cell
description
ctime        2024-11-19 15:21:43.745556+00:00
mtime        2024-11-19 15:21:45.746231+00:00

Inputs          PK    Type
--------------  ----  -------------
lua
    md_run      632   Bool   ### Here you see the value is used.
pseudos
    Si          69    PsmlData
clean_workdir   631   Bool

You can run a quick check by inspecting the value of the input

print(relax_atoms.inputs["lua.md_run"].value)

also inspect the inputs of the WorkChain using:

verdi process show pk

@Xing, could you also try an example where you set lua.md_run to False? md_run is by default always set to True even if you don’t specify it so this example doesn’t really show if it works or not.

Hi @ahkole, Thanks for pointing this out. Yes, this is indeed a bug that was introduced recently. I made a PR to fix it. I will release a new version soon and keep you posted.

Hi @Xing , good to hear that you found the bug! I am looking forward to the new version.