Unusual behavior in the QueryBuilder

Hello everyone!

I was trying to access some data with the QueryBuilder when I ran into some unexpected behavior.

I exported some nodes from another profile (on another machine, running AiiDA v1.6.1) and imported them into the current profile (AiiDA v2.2.2.post0).
I then tried to find the nodes, and that works fine:

from aiida import orm
qb = orm.QueryBuilder()
qb.append(orm.WorkChainNode, filters={'and':[{'extras.property':{'in':['stacking_fault_energy']}}, {'attributes.process_state': {'==': 'finished'}}, {'extras': {'has_key': 'formula'}}]}, tag='workchain',
project=['uuid'])
qb.all(flat=True)
['85d22007-2aa8-4da8-a817-af98dc906621',
 '731fb5fd-478a-401d-bf2d-a95a48681e6c']

I then tried to find which code I used for these workchains, and suddenly I get only one match instead of two:

from aiida import orm
qb = orm.QueryBuilder()
qb.append(orm.WorkChainNode, filters={'and':[{'extras.property':{'in':['stacking_fault_energy']}}, {'attributes.process_state': {'==': 'finished'}}, {'extras': {'has_key': 'formula'}}]}, tag='workchain',
project=['uuid'])
qb.append(orm.Code, with_outgoing='workchain', project=['uuid'])
qb.all(flat=True)

['731fb5fd-478a-401d-bf2d-a95a48681e6c',
 'f82964d2-ee74-4d9e-9088-a897884ca793']

This is unexpected, since when I check each workchain individually, each one uses a different code. Even stranger, if I query for input structures I get an empty result:

from aiida import orm
qb = orm.QueryBuilder()
qb.append(orm.WorkChainNode, filters={'and':[{'extras.property':{'in':['stacking_fault_energy']}}, {'attributes.process_state': {'==': 'finished'}}, {'extras': {'has_key': 'formula'}}]}, tag='workchain',
project=['uuid'])
qb.append(orm.StructureData, with_outgoing='workchain', project=['uuid'])
qb.all(flat=True)
[]

When I inspect each node, I can see that they clearly have codes and structure nodes

node = load_node('731fb5fd-478a-401d-bf2d-a95a48681e6c')
incoming = node.get_incoming()
print(incoming.get_node_by_label('code'))
print(incoming.get_node_by_label('initial_structure'))
Remote code 'vasp-5.4.4-std-intel-hpc' on moggie (Imported #1) pk: 21736, uuid: f82964d2-ee74-4d9e-9088-a897884ca793
uuid: a09c6b09-cdd8-446d-bede-33e211450f97 (pk: 112171)

What is even stranger: if I write the query so that I do not explicitly ask for orm.Code or orm.StructureData, but instead for nodes whose attributes match those of these node types, then I do get the information that I want:

qb = orm.QueryBuilder()
qb.append(orm.Node, tag='structure', filters={'attributes':{'has_key':'cell'}}, project=['uuid'])
qb.append(orm.WorkChainNode, filters={'and':[{'extras.property':{'in':['stacking_fault_energy']}}, {'attributes.process_state': {'==': 'finished'}}, {'extras': {'has_key': 'formula'}}]}, tag='workchain'
, with_incoming='structure', project=['uuid'])
qb.all()

[['cee1ac28-af23-4244-8f2f-7e45002628c5',
  '85d22007-2aa8-4da8-a817-af98dc906621'],
 ['ab04a00f-6541-4621-b54f-1578ce55c871',
  '85d22007-2aa8-4da8-a817-af98dc906621'],
 ['4ca3b4eb-6156-48ff-a4a3-23b4a4e4bd74',
  '85d22007-2aa8-4da8-a817-af98dc906621'],
 ['a09c6b09-cdd8-446d-bede-33e211450f97',
  '731fb5fd-478a-401d-bf2d-a95a48681e6c']]

This leads me to suspect that, for some reason, information about the node type of these inputs was lost during the import. Has anyone experienced similar behavior?

Thanks!

Hi Jonathan, just to confirm, you exported in AiiDA 1.6 and migrated the archive file to 2.x using AiiDA 2.2, and then imported there, right?
We changed the string identifying the node type by adding core to it. Maybe there is some issue in the conversion done for export files (the migration was implemented both for migrating the database directly and for migrating archive files). Can you load e.g. two Codes, one that appears in the QueryBuilder results and one that does not, and look at the node.node_type of both? My guess is that, for some reason, they will be different.

Hi @giovannipizzi! Yes, I exported from AiiDA 1.6 and then imported into an AiiDA 2.2 instance (it took care of the migration from one to the other during the import).

You are right, it seems that for some reason the core part is missing after the migration:

code_not_in_qb = load_node('85d22007-2aa8-4da8-a817-af98dc906621').inputs.code
code_in_qb = load_node('731fb5fd-478a-401d-bf2d-a95a48681e6c').inputs.code
print(f"Not in QB: {code_not_in_qb.node_type}")
print(f"In QB: {code_in_qb.node_type}")

Not in QB: data.code.Code.
In QB: data.core.code.Code.

The same happens with the StructureData nodes: they have node_type 'data.structure.StructureData.' instead of 'data.core.structure.StructureData.'
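
Until the node_type strings are fixed, a query-side workaround might look like the following. This is only a minimal sketch: the helper node_type_filter and the mapping LEGACY_TO_CORE are hypothetical names (not part of AiiDA), and the mapping contains only the two node_type pairs reported in this thread. It builds a plain filter dict that matches both spellings of a node type.

```python
# Legacy -> current node_type strings. Only the two pairs reported in this
# thread are listed; this is NOT the full table used by the AiiDA migration.
LEGACY_TO_CORE = {
    'data.code.Code.': 'data.core.code.Code.',
    'data.structure.StructureData.': 'data.core.structure.StructureData.',
}


def node_type_filter(core_node_type):
    """Build a filter dict matching a node_type and its known legacy spelling.

    Hypothetical helper: it only assembles a dictionary and does not touch
    the database.
    """
    candidates = [core_node_type]
    candidates += [old for old, new in LEGACY_TO_CORE.items() if new == core_node_type]
    return {'node_type': {'in': candidates}}


structure_filter = node_type_filter('data.core.structure.StructureData.')
print(structure_filter)
```

The resulting dict could then be passed as filters=structure_filter when appending orm.Node (rather than orm.StructureData) with with_outgoing='workchain', in the same spirit as the attribute-based query earlier. This is untested against the affected profile.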

Is there some way to fix this? I am not entirely sure what could have happened in the migration to cause this problem.

Do you still have the 1.6 archive file? This could be useful for debugging. Also, if you still have the 1.6 profile, you could check there if the two nodes somehow differ?

In the 1.6 profile the nodes all seem to be the same

from aiida import __version__
code_not_in_qb = load_node('85d22007-2aa8-4da8-a817-af98dc906621').inputs.code
code_in_qb = load_node('731fb5fd-478a-401d-bf2d-a95a48681e6c').inputs.code
print(f"Not in QB: {code_not_in_qb.node_type}")
print(f"In QB: {code_in_qb.node_type}")
print(f"AiiDA version: {__version__}")
Not in QB: data.code.Code.
In QB: data.code.Code.
AiiDA version: 1.6.2

This means that one of them was changed in the migration but not the other, which is quite weird. I wonder if it is an issue with another code with the same label already being there?

I can share the archive with you by mail if that is okay.

Hi Jonathan, thanks for sharing the archive with me.

Do you know if you are using a tagged version of AiiDA (1.6)? You mention 1.6.2, but your archive file says that the archive is at version 0.12, and the migration to the new data.core... names happened between 0.11 and 0.12. This is why those weren’t migrated (and probably something at a later stage created the additional codes).

And in all 1.6.x tagged versions, the archive version was 0.10. So you probably used some custom version in between, during the alpha/beta phase of 2.0?
Maybe @sphuber remembers at which point we introduced the archive versions 0.11, 0.12, 0.13? Also, there might be some other inconsistency in your version/export file: if I migrate it with AiiDA 2.x, the files are not migrated, only the DB, which is weird (a properly exported file from 1.6.1 at archive version 0.10 instead properly migrates the files in the repository as well).

Hi @giovannipizzi ! Thank you for your help!
I think that might indeed be the case. This was a dev environment: I ran some calculations to test, and then I wanted to move them to another environment to test some other things without having to redo the calculations.

The 1.6.2 environment is actually a clone of the development repository which, from what I can see, is at commit f2367e95595dde6fcfdd64e1971ac805764a419d

I think that was a very “development” branch between 1.x and 2.x. E.g. in that commit the core part was indeed not yet included, and the commit has 0.12 as the archive version it will produce.
But things were probably changed before making 2.0.0beta1, e.g. the migration v11-v12 in that commit became migration v10-v11 in 2.0. I don’t remember anymore, but I think we might have decided that, since it was a development branch, we didn’t have the human time to also support the intermediate changes in the development branch between 1.6 and 2.0. As I said, other things in the export file of that version might also not work correctly, such as the migration of files in the repository.

You could try to back up and duplicate the profile, then migrate to 2.0, and then to 2.5 (not as an export file, but by upgrading the code and letting it migrate). But again, since it is a development version, I’m not sure it’s going to work (hence my suggestion to have a backup!).

How critical is it for you to migrate that data? I tried manually changing the version in the metadata.json of your file to 0.10 or 0.11: 0.10 crashes; 0.11 kind of works, but the files are still not migrated. So I’m not sure there is a simple solution.
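
For anyone who wants to check which archive version a file declares before importing it, here is a minimal sketch. It assumes a zip-format archive with metadata.json at the top level and an export_version key inside it; both are assumptions about the archive layout, and read_archive_metadata is a hypothetical helper, not an AiiDA function. The in-memory archive is only a stand-in that reuses the version numbers mentioned in this thread.

```python
import io
import json
import zipfile


def read_archive_metadata(archive):
    """Return the parsed metadata.json of a zip-format archive.

    `archive` may be a file path or a binary file object. Assumes that
    metadata.json sits at the top level of the zip.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open('metadata.json') as handle:
            return json.load(handle)


# Stand-in archive built in memory, for demonstration only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr(
        'metadata.json',
        json.dumps({'export_version': '0.12', 'aiida_version': '1.6.2'}),
    )
buffer.seek(0)

metadata = read_archive_metadata(buffer)
# The key names are assumptions about the archive metadata layout.
print(metadata.get('export_version'), metadata.get('aiida_version'))
```

For a real file, pass its path instead of the buffer. A tagged 1.6.x release reporting anything other than 0.10 would point to a development build, as in this case.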

As a note - migrations and archives should work correctly for officially released versions - if this is not the case, this would be a bug.

Yes, it is not critical data; as I mentioned, it was some tests that we ran some time ago and did not want to have to run again. But I can try what you suggest to see if that makes it work.
Thanks for the help!