Hi,
I have a couple of questions concerning the usage of withmpi in CalcJobs.
More precisely, I am referring to the Wannier90 and QE plugins, but it seems that this behavior is fundamental and I am not 100% sure whether it is intended like this. In case it is, I would be really interested in the line of reasoning.
There are three different levels at which withmpi can be specified. In the QE plugin, withmpi is set to True by default, but submitting a PwCalculation shows that it is run without MPI. A small example to reproduce the behavior:
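from aiida.engine import submit
from aiida.orm import KpointsData
from aiida.plugins import CalculationFactory

PwCalculation = CalculationFactory("quantumespresso.pw")

# code, structure, pseudos and pw are assumed to be defined/loaded beforehand
b = PwCalculation.get_builder()
b.code = code
b.structure = structure
kpoints = KpointsData()
kpoints.set_kpoints_mesh([4, 4, 4])
b.kpoints = kpoints
b.metadata.options.resources = {"num_machines": 1, "num_mpiprocs_per_machine": 4}
b.parameters = pw.parameters
b.pseudos = pseudos
# note: b.metadata.options.withmpi is deliberately not set here
submit(b)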
If one adds the line b.metadata.options.withmpi = True, MPI is used correctly. withmpi isn't specified on the code level, so this isn't an issue of the different levels overwriting each other.
On the other hand, one doesn't observe this behavior when using the PwBaseWorkChain; in that case, the builder correctly recognizes the default (True) value for withmpi.
I followed the different parts of the source code, ending up in the plumpy package, and my naive understanding is the following:
The CalcJob looks for withmpi in the raw_inputs (https://github.com/aiidateam/aiida-core/blob/ec64780c206cdb040eee740b17865e6f0ff81cd8/aiida/engine/processes/calcjobs/calcjob.py#L925C31-L925C42) and I would expect that this is populated by the default value which is specified in the QE plugin.
If this is not the case, then, since withmpi is specified neither in the CodeInfo nor on the code level, all three levels would be None and the default of the aiida-core CalcJob implementation, which is False, would be used (this seems to be what is happening in the example). When one checks the _inputs method of the builder, one observes that withmpi isn't returned unless it is explicitly specified. It seems that these builder inputs are what end up in the raw_inputs, which would to some extent explain why it works when a WorkChain is used. Please correct me if I'm wrong.
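To make my understanding concrete, here is a minimal, self-contained sketch of the resolution logic as I read it. This is not the actual aiida-core code: the function name resolve_withmpi, the constant CORE_FALLBACK_WITHMPI and the error raised on conflicting values are my own assumptions for illustration.

```python
from typing import Optional

# Assumption: stands in for the aiida-core CalcJob fallback described above.
CORE_FALLBACK_WITHMPI = False


def resolve_withmpi(
    option_value: Optional[bool],
    code_info_value: Optional[bool],
    code_value: Optional[bool],
    fallback: bool = CORE_FALLBACK_WITHMPI,
) -> bool:
    """Hypothetical helper: combine the three levels into one setting."""
    # Only values that were explicitly set (not None) take part.
    explicit = {
        value
        for value in (option_value, code_info_value, code_value)
        if value is not None
    }
    if len(explicit) > 1:
        # Assumption: conflicting explicit values are treated as an error.
        raise ValueError("conflicting withmpi values across levels")
    if explicit:
        return explicit.pop()
    # Nothing was set explicitly on any level: use the core fallback.
    return fallback


# The plugin declares a default of True, but it never reaches the raw inputs
# unless the user sets it explicitly on the builder.
plugin_default = {"withmpi": True}

raw_inputs_unset = {}  # withmpi not set on the builder
raw_inputs_set = {"withmpi": True}  # withmpi set explicitly

without_explicit = resolve_withmpi(raw_inputs_unset.get("withmpi"), None, None)
with_explicit = resolve_withmpi(raw_inputs_set.get("withmpi"), None, None)

print(without_explicit, with_explicit)
```

In this sketch the plugin default is never consulted, which is exactly the behavior that surprises me in the example above.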
Although it is understandable from the code why this is happening, I'm not 100% sure whether this is the intended behavior.
As a user, one would expect that the default value of the CalcJob is used.
Thanks a lot in advance!