General description of the problem
I am trying to understand AiiDA's restart-from-checkpoint functionality. From the aiida-core Read the Docs pages it is hard for me to see how this works in practice, so I tried to create a minimal example. Either I cannot get it to work properly or I am misunderstanding something.
Steps to reproduce
The minimal example looks like this
# workchain_minimal.py
from aiida.engine import ToContext, WorkChain, calcfunction
from aiida.orm import AbstractCode, Int
from aiida.plugins.factories import CalculationFactory
from aiida.manage.configuration import get_config
from aiida.engine import submit, run
from aiida import orm, load_profile
import time

ArithmeticAddCalculation = CalculationFactory('core.arithmetic.add')


class AddThreeNumbersWorkChain(WorkChain):

    @classmethod
    def define(cls, spec):
        super().define(spec)
        spec.input('x', valid_type=Int)
        spec.input('y', valid_type=Int)
        spec.input('z', valid_type=Int)
        spec.input('code', valid_type=AbstractCode)
        spec.outline(
            cls.add_xy,
            cls.add_xyz,
            cls.result,
        )
        spec.output('result', valid_type=Int)

    def add_xy(self):
        print("Run add_xy")
        inputs = {'x': self.inputs.x, 'y': self.inputs.y,
                  'code': self.inputs.code}
        add_xy_job_node = self.submit(ArithmeticAddCalculation, **inputs)  # calc job node
        return ToContext(add_xy_job_node=add_xy_job_node)

    def add_xyz(self):
        print("Run add_xyz")
        # raise ValueError("Some bug")
        inputs = {'x': self.ctx.add_xy_job_node.outputs.sum,
                  'y': self.inputs.z, 'code': self.inputs.code}
        add_xyz_job_node = self.submit(ArithmeticAddCalculation, **inputs)
        return ToContext(add_xyz_job_node=add_xyz_job_node)

    def result(self):
        self.out('result', self.ctx.add_xyz_job_node.outputs.sum)
and the file I use to run
# run.py
from aiida.engine import ToContext, WorkChain, calcfunction
from aiida.orm import AbstractCode, Int
from aiida.plugins.factories import CalculationFactory
from aiida.manage.configuration import get_config
from aiida.engine import submit, run
from aiida import orm, load_profile
from workchain_minimal import AddThreeNumbersWorkChain
builder = AddThreeNumbersWorkChain.get_builder()
builder.code = orm.load_code(label='add')
builder.x = orm.Int(2)
builder.y = orm.Int(3)
builder.z = orm.Int(5)
result = run(builder)
print(result)
Next, in workchain_minimal.py, I uncommented the line in the add_xyz function that raises the ValueError, and then ran verdi run run.py
Then, following the instructions from the Quantum ESPRESSO tutorial, I restart the process with this script
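# restart.py
from aiida.engine import run, submit
from aiida.orm import load_node  # imported explicitly so that load_node is defined in this script

# <PK_OF_FAILED_PROCESS> is a placeholder for the PK of the failed work chain
failed_calculation = load_node(<PK_OF_FAILED_PROCESS>)
restart_builder = failed_calculation.get_builder_restart()
calcjob_node = run(restart_builder)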
Question
From the printed output I see that add_xy is executed again. I had assumed it would be skipped, since it completed successfully the first time. If someone could clarify my misunderstanding of AiiDA's restart-from-checkpoint functionality, or explain what I am doing wrong in my script, I would be very grateful.
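One possible explanation, which is an assumption and not something confirmed here: get_builder_restart may return a builder that is only pre-populated with the inputs of the original process, so that run(restart_builder) launches a new process from the first outline step. The sketch below is a toy model in plain Python, not AiiDA code. ToyWorkflow and relaunch_from_inputs are hypothetical names invented for illustration. It only shows why a relaunch built from the old inputs would repeat add_xy.

```python
# Toy model in plain Python. This is NOT AiiDA code and does not use AiiDA's API.
# ToyWorkflow and relaunch_from_inputs are hypothetical names for illustration.


class ToyWorkflow:
    """Runs a fixed outline of steps in order, sharing a context dict."""

    def __init__(self, x, y, z, fail_in_add_xyz=False):
        self.inputs = {'x': x, 'y': y, 'z': z}
        self.ctx = {}        # intermediate results, local to this instance
        self.executed = []   # names of the steps that were started
        self.fail_in_add_xyz = fail_in_add_xyz

    def add_xy(self):
        self.executed.append('add_xy')
        self.ctx['xy'] = self.inputs['x'] + self.inputs['y']

    def add_xyz(self):
        self.executed.append('add_xyz')
        if self.fail_in_add_xyz:
            raise ValueError('Some bug')
        self.ctx['xyz'] = self.ctx['xy'] + self.inputs['z']

    def run(self):
        for step in (self.add_xy, self.add_xyz):
            step()
        return self.ctx['xyz']


def relaunch_from_inputs(failed, **kwargs):
    """Build a new workflow from the inputs of an old one.

    Only the inputs are copied. The context of the failed run is not,
    so the new run has to start again at the first step.
    """
    return ToyWorkflow(**failed.inputs, **kwargs)


first = ToyWorkflow(2, 3, 5, fail_in_add_xyz=True)
try:
    first.run()
except ValueError:
    pass  # the first run stops inside add_xyz

second = relaunch_from_inputs(first)  # same inputs, bug removed
result = second.run()

print(first.executed)   # ['add_xy', 'add_xyz']
print(second.executed)  # ['add_xy', 'add_xyz']
print(result)           # 10
```

In this toy model the second run repeats add_xy because nothing from the first run's context is carried over, which matches the prints I observe. Whether AiiDA behaves the same way, and whether something like caching is the intended way to avoid repeating finished calculations, is exactly what I would like to have confirmed.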
Environment
✔ version: AiiDA v2.5.1.post0
✔ config: /home/alexgo/code/aiida-core/.aiida
✔ profile: alexgo
✔ storage: SqliteDosStorage[/home/alexgo/code/aiida-core/.aiida/repository/sqlite_dos_f275ff0f10174e8e84c13e82dd5ba452]: open,
✔ broker: RabbitMQ v3.8.14 @ amqp://guest:guest@127.0.0.1:5672?heartbeat=600
✔ daemon: Daemon is running with PID 608674