Processes get excepted due to memory error from plumpy

Hello everyone,
I am using aiida-core v2.4.0 and plumpy ~0.22 with a very large database (~5M nodes) on a PostgreSQL backend. I keep getting the following error for seemingly random processes:

2025-06-07 14:45:44 [1897912 | ERROR]: Traceback (most recent call last):
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/aiida/manage/external/rmq/launcher.py", line 88, in _continue
    result = await super()._continue(communicator, pid, nowait, tag)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/process_comms.py", line 604, in _continue
    proc = cast('Process', saved_state.unbundle(self._load_context))
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/persistence.py", line 58, in unbundle
    return Savable.load(self, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/persistence.py", line 453, in load
    return load_cls.recreate_from(saved_state, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/processes.py", line 249, in recreate_from
    process = cast(Process, super().recreate_from(saved_state, load_context))
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/persistence.py", line 478, in recreate_from
    call_with_super_check(obj.load_instance_state, saved_state, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/base/utils.py", line 31, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/aiida/engine/processes/workchains/workchain.py", line 169, in load_instance_state
    super().load_instance_state(saved_state, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/aiida/engine/processes/process.py", line 319, in load_instance_state
    super().load_instance_state(saved_state, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/processes.py", line 636, in load_instance_state
    super().load_instance_state(saved_state, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/base/utils.py", line 16, in wrapper
    wrapped(self, *args, **kwargs)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/persistence.py", line 485, in load_instance_state
    self.load_members(self._auto_persist, saved_state, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/persistence.py", line 536, in load_members
    setattr(self, member, self._get_value(saved_state, member, load_context))
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/persistence.py", line 591, in _get_value
    value = Savable.load(value, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/persistence.py", line 453, in load
    return load_cls.recreate_from(saved_state, load_context)
  File "/home/bastonero/.conda/envs/aiida/lib/python3.9/site-packages/plumpy/persistence.py", line 630, in recreate_from
    obj = cls(loop=loop)
MemoryError

Has this been noticed and/or fixed yet?

AFAIK we have not fixed any MemoryError related to plumpy. In v2.7.0 (to be released next week) we fixed some issues with the disk-objectstore not releasing file descriptors, but I doubt that is the problem here. I find it odd that the persistence module raises this error, since it should not be memory heavy. Also, the size of your database should not really be relevant. Is this reproducible on your side with the same traceback? I am sorry I cannot be more helpful, but could you provide more information so we can try to reproduce it?

Thanks for the help! I know it is a rather mysterious error; I have also never seen it in five years of using AiiDA. Unfortunately, it is not reproducible: I somehow reduced the workload and now everything is fine. It started happening when I used a script that submits many processes and then waits for them to finish, instead of a proper workchain (see the sketch below). It is still weird, but I will keep you posted if it happens again so I can try to figure out what triggers it. Thanks again!
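
For reference, here is a minimal sketch of the kind of submit-and-wait pattern I mean, not my actual script; the workchain import and the inputs are hypothetical placeholders:

```python
# Sketch of submitting many processes from a script and polling until they
# finish, instead of orchestrating them inside a parent workchain.
# `MyWorkChain` and `all_inputs` are placeholders for illustration only.
import time

from aiida import load_profile
from aiida.engine import submit

from my_plugin.workchains import MyWorkChain  # hypothetical workchain class

load_profile()

all_inputs = [...]  # many input dictionaries prepared beforehand (placeholder)

# Submit every process up front ...
nodes = [submit(MyWorkChain, **inputs) for inputs in all_inputs]

# ... then poll from the script until all of them have terminated.
while not all(node.is_terminated for node in nodes):
    time.sleep(60)
```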