Hi all!
I encountered unexpected behavior: a WorkChain somehow continues with the next outline step before the previous one has finished. I’m using the CommonRelaxWorkChain of the aiida-common-workflows package (aiida-common-workflows/src/aiida_common_workflows/workflows/relax/workchain.py at master · aiidateam/aiida-common-workflows · GitHub).
This is the part where I observed the issue:
    def run_workchain(self):
        """Run the wrapped workchain."""
        inputs = self.exposed_inputs(self._process_class)
        return ToContext(workchain=self.submit(self._process_class, **inputs))

    def inspect_workchain(self):
        """Inspect the terminated workchain."""
        cls = self._process_class.__name__
        if not self.ctx.workchain.is_finished_ok:
            exit_status = self.ctx.workchain.exit_status
            self.report(f'{cls}<{self.ctx.workchain.pk}> failed with exit status {exit_status}.')
            return self.exit_codes.ERROR_SUB_PROCESS_FAILED.format(cls=cls, exit_status=exit_status)
        self.report(f'{cls}<{self.ctx.workchain.pk}> finished successfully.')
The WorkChain fails, stating that the subprocess failed (the exit code originates from the inspect_workchain step). Interestingly, the subprocess actually finishes successfully, i.e. with exit_status 0. Therefore, the if statement should never be triggered, at least to my understanding.
verdi process status 13594
QuantumEspressoCommonRelaxWorkChain<13594> Finished [400] [1:inspect_workchain]
    └── PwRelaxWorkChain<13602> Finished [0] [3:results]
        └── PwBaseWorkChain<13606> Finished [0] [3:results]
            └── PwCalculation<13613> Finished [0]
I inserted some report statements after cls = self._process_class.__name__, and indeed, self.ctx.workchain.is_finished_ok is False at that point, and so is self.ctx.workchain.is_sealed. Moreover, inspecting the attributes of the subprocess (by inserting another report statement before the if block) shows that they still contain a checkpoint, which I haven’t seen before for finished processes. I’m not posting the attributes output here right now, as it is quite lengthy. If it is needed, just let me know.
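For reference, the kind of check described above can be condensed into a small helper. This is only a sketch, not the exact statements used: the helper name describe_node and the has_checkpoint key are hypothetical, and it assumes the node exposes pk, process_state, exit_status, is_finished_ok, is_sealed and checkpoint as attributes (anything missing is reported as None). The stand-in object at the bottom replaces a real node so the snippet runs without an AiiDA profile.

```python
from types import SimpleNamespace


def describe_node(node):
    """Summarise the fields of a process node that matter for this issue.

    Hypothetical helper: works on any object exposing the attributes below,
    and reports None for attributes that are missing.
    """
    fields = ('pk', 'process_state', 'exit_status', 'is_finished_ok', 'is_sealed')
    summary = {name: getattr(node, name, None) for name in fields}
    # Assumption: a terminated process no longer carries a checkpoint, so only
    # record whether one is present instead of dumping its lengthy content.
    summary['has_checkpoint'] = getattr(node, 'checkpoint', None) is not None
    return summary


# Stand-in for self.ctx.workchain, set to the state observed in inspect_workchain.
# Values not stated in the post (process_state, exit_status) are left out on purpose.
fake_node = SimpleNamespace(pk=13602, is_finished_ok=False, is_sealed=False, checkpoint='...')
print(describe_node(fake_node))
```

Inside inspect_workchain this would be called as self.report(str(describe_node(self.ctx.workchain))) before the if block, which keeps the report short enough to paste here.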
I’m using AiiDA 2.6.2. Moreover, it seems that this doesn’t happen for a single worker, but only when I have multiple workers.