Caching behaviour on imported archives

Federico_Orlando · December 2, 2024, 11:11am

Hi everyone

I have an external .aiida archive containing a number of calculations that other people ran with a workchain I wrote, which I later found to have a bug. Since the bug manifested itself only rarely in practice, most of these calculations gave valid results and only a small fraction must be repeated. My idea was to import the archive and relaunch everything with the debugged workchain; having enabled caching on my profile, I was expecting that
i) once imported into my database, previously external nodes are treated as if I myself produced them
ii) all calculations not affected by the bug, being provided the same input, are thus fetched from the cache instead of being actually run

To test this behaviour, I created on purpose an .aiida archive with only one call of the workchain (the last I had submitted myself), removing the corresponding node from the process list, and imported it back (node 47304 below). Since I hadn’t changed the workchain in the meantime, I then tried to resubmit it (47370) expecting the caching mechanism to take care of it. Yet, the calculation called by the workchain was run from scratch. At this point, I tried to run it once again (47393) and this time the calculation was correctly picked up from the cache.

47304 2D ago AmarantaWorkChain Finished [11]
47309 2D ago run_ase_calculation Finished [0]
47370 33m ago AmarantaWorkChain Finished [11]
47372 33m ago run_ase_calculation Finished [0]
47393 4s ago AmarantaWorkChain Finished [11]
47395 3s ago run_ase_calculation Finished [0]

I then checked the hash and is_valid_cache attributes for the three calculation nodes:

In [1]: m, n, o = load_node(47309), load_node(47372), load_node(47395)
In [2]: m.base.caching.get_hash()
Out[2]: '18ee3e98b06fa157428b9595ce8ab53ac5f03b2f6cfa4f1bbfee1c77c7b0c693'
In [3]: n.base.caching.get_hash()
Out[3]: '18ee3e98b06fa157428b9595ce8ab53ac5f03b2f6cfa4f1bbfee1c77c7b0c693'
In [4]: o.base.caching.get_hash()
Out[4]: '18ee3e98b06fa157428b9595ce8ab53ac5f03b2f6cfa4f1bbfee1c77c7b0c693'
In [5]: m.base.caching.is_valid_cache
Out[5]: True
In [6]: n.base.caching.is_valid_cache
Out[6]: True
In [7]: o.base.caching.is_valid_cache
Out[7]: True

To my surprise, each of them, and specifically the imported one (47309), is indeed a valid cache source and the hashes are identical, yet the imported node is not picked up. Do you have any idea why?
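
A side-by-side check that also includes the computer UUID may help narrow this down. The following is only a sketch: `cache_summary` and `fake_node` are hypothetical helper names, and the stand-in objects exist solely so that the snippet runs without an AiiDA profile. It assumes that nodes expose `pk`, `base.caching.get_hash()`, `base.caching.is_valid_cache` and, where applicable, `computer.uuid`.

```python
from types import SimpleNamespace


def cache_summary(node):
    """Collect the fields that seem relevant to a cache lookup for one node."""
    # Nodes without an associated computer are reported with None.
    computer = getattr(node, "computer", None)
    return {
        "pk": node.pk,
        "hash": node.base.caching.get_hash(),
        "is_valid_cache": node.base.caching.is_valid_cache,
        "computer_uuid": computer.uuid if computer is not None else None,
    }


def fake_node(pk, node_hash, computer_uuid):
    """Hypothetical stand-in for a loaded node (not an AiiDA class)."""
    caching = SimpleNamespace(get_hash=lambda: node_hash, is_valid_cache=True)
    return SimpleNamespace(
        pk=pk,
        base=SimpleNamespace(caching=caching),
        computer=SimpleNamespace(uuid=computer_uuid),
    )


# Made-up values for illustration only.
nodes = [fake_node(1, "abc", "uuid-A"), fake_node(2, "abc", "uuid-A")]
rows = [cache_summary(n) for n in nodes]
for row in rows:
    print(row)
```

In a `verdi shell`, the list of stand-ins would be replaced by the real nodes, for example `[load_node(47309), load_node(47372), load_node(47395)]`.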

(In passing: the workchain - still bugged - is running in “safe mode”, just executing the first calculation, and this is why its status is Finished [11] (most required output is not produced). Still, the caching mechanism holds for calculations only and here the calculations finished correctly [0] so I’m assuming this is not an issue)

Thanks a lot for your help
Best regards
Federico Orlando

danielhollas · December 2, 2024, 2:41pm

Hi Federico,

Just a quick question: which aiida-core version are you using? If you’re not using the latest 2.6.x version, please try to upgrade, since there have been quite a few changes and improvements to the caching mechanism.
Also note that when upgrading from earlier AiiDA versions, you will have to recalculate the hashes. The same probably applies if you import an archive in which some results were calculated with an older AiiDA version:

verdi node rehash

See changelog for more details:

https://github.com/aiidateam/aiida-core/blob/main/CHANGELOG.md#improvements-and-changes-to-caching

I haven’t looked at the details of your report yet, it’s possible there is still a bug that is somehow specific to imported archives.

danielhollas · December 3, 2024, 5:24pm

it’s possible there is still a bug that is somehow specific to imported archives.

While looking at a piece of code related to caching, I found this comment

# computer names are changed by aiida-core if imported 
# and do not have same uuid.

It’s possible that this is (one of) the reasons why caching does not work as expected for imported archives. The comment may be outdated, though, as it was definitely written before v2.6. I’ll dig more into this.

https://github.com/aiidateam/aiida-test-cache/blob/7caebdb91989e12dfb1e82995e997d5f42fcfe80/aiida_test_cache/archive_cache/_fixtures.py#L252

calcjob_ignored_attributes = (
    *tuple(hash_ignore_config.get("calcjob_attributes", [])), "version"
)
calcjob_ignored_inputs = tuple(hash_ignore_config.get('calcjob_inputs', []))

def mock_objects_to_hash_code(self):
    """
    Return a list of objects which should be included in the hash of a Code node
    """
    self = get_node_from_hash_objects_caller(self)
    # computer names are changed by aiida-core if imported and do not have same uuid.
    return [self.get_attribute(key='input_plugin')]

def mock_objects_to_hash_calcjob(self):
    """
    Return a list of objects which should be included in the hash of a CalcJobNode.
    code from aiida-core, only self.computer.uuid is commented out
    """
    hash_ignored_inputs = self._hash_ignored_inputs
    self = get_node_from_hash_objects_caller(self)

sphuber · December 3, 2024, 6:31pm

The problem is most likely that the computer for the new calculation does not match that of the original calculation from the archive. And since the computer’s UUID is part of the hash:

https://github.com/aiidateam/aiida-core/blob/ec52f4ef321f9ef1fa24e5a3056153ea55bce7d4/src/aiida/orm/nodes/caching.py#L73

    """Return a list of objects which should be included in the hash."""

    return {
        'class': str(self._node.__class__),
        'attributes': {
            key: val
            for key, val in self._node.base.attributes.items()
            if key not in self._node._hash_ignored_attributes and key not in self._node._updatable_attributes
        },
        'repository_hash': self._node.base.repository.hash(),
        'computer_uuid': self._node.computer.uuid if self._node.computer is not None else None,
    }

def get_hash(self) -> str | None:
    """Return the hash that was computed and stored for this node or ``None``.

    This does not recompute the hash but simply returns the hash that was computed when the node was stored. If the
    hash was reset, using :meth:`aiida.orm.nodes.caching.NodeCaching.clear_hash` for example, it will return
    ``None``.
    """
    return self._node.base.extras.get(self._HASH_EXTRA_KEY, None)

The computed hash will be different. We decided to include the computer’s UUID in the hash because, in principle, changing the computer can change the results of a calculation for a whole host of reasons: different environments, different libraries, etc.
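
To illustrate the effect, here is a toy model, not aiida-core’s actual hashing code: `toy_hash` is a hypothetical helper that uses `hashlib.sha256` over a JSON dump, and the UUID strings and input values are made up. It only shows that identical inputs stop matching once a different computer identifier enters the hashed objects.

```python
import hashlib
import json


def toy_hash(attributes, computer_uuid):
    """Hash the inputs together with a computer UUID (toy model only)."""
    objects = {"attributes": attributes, "computer_uuid": computer_uuid}
    # sort_keys makes the serialisation, and hence the hash, deterministic.
    payload = json.dumps(objects, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Made-up inputs and computer identifiers for illustration.
inputs = {"cutoff": 30, "kpoints": [4, 4, 4]}
hash_original = toy_hash(inputs, "computer-uuid-from-archive")
hash_same = toy_hash(inputs, "computer-uuid-from-archive")
hash_other = toy_hash(inputs, "computer-uuid-local")

print(hash_original == hash_same)   # True: same inputs, same computer
print(hash_original == hash_other)  # False: same inputs, different computer
```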

Unfortunately, there is currently no way to temporarily disable this. Implementing it wouldn’t be trivial: temporarily removing the computer UUID from the objects used to compute the hash would only affect the hashes of new calculations, while the old calculations would still carry the hash that was computed with the computer’s UUID. So after disabling it, you would first have to recompute the hash of all calculations (or of some target subset), which is not really practical. I am therefore not sure what pragmatic solution we could provide here, to be honest.
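
The stale-hash issue can be sketched with a toy model. Everything here is hypothetical (`ToyNode`, `toy_hash`, the `include_computer` switch and the UUID strings); it is not how aiida-core is implemented, and it only mimics the fact that a hash is stored once and then reused until it is explicitly recomputed.

```python
import hashlib
import json


def toy_hash(attributes, computer_uuid, include_computer=True):
    """Hash the inputs, optionally together with the computer UUID (toy model)."""
    objects = {"attributes": attributes}
    if include_computer:
        objects["computer_uuid"] = computer_uuid
    payload = json.dumps(objects, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ToyNode:
    """Stores its hash once at creation, like a stored node keeps its hash."""

    def __init__(self, attributes, computer_uuid, include_computer=True):
        self.attributes = attributes
        self.computer_uuid = computer_uuid
        self.stored_hash = toy_hash(attributes, computer_uuid, include_computer)

    def rehash(self, include_computer):
        """Recompute and overwrite the stored hash."""
        self.stored_hash = toy_hash(
            self.attributes, self.computer_uuid, include_computer
        )


inputs = {"cutoff": 30}
# Old node: hashed while the computer UUID was still part of the hash.
old = ToyNode(inputs, "uuid-archive")
# New node: hashed after the UUID was (hypothetically) switched off.
new = ToyNode(inputs, "uuid-local", include_computer=False)

hit_before = old.stored_hash == new.stored_hash
old.rehash(include_computer=False)
hit_after = old.stored_hash == new.stored_hash
print(hit_before, hit_after)  # False True
```

In this toy model the old node only becomes a match after its stored hash is recomputed under the new rule, which is the extra rehashing step described above.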
