Calculations get stuck in "created" state

Dear all,
For the last few weeks I have been struggling a lot with calculations that get stuck in the created state, sometimes for days. Sometimes they start again after I restart the daemon and/or repair the processes; many other times nothing changes and I have to delete the calculation and launch it again. I was using aiida-core 2.6.1 and tried updating to 2.6.2, but the problem persists.
Do you have suggestions?
Thank you.

Hi @Davide_Bidoggia, could you please provide some more information? What is the output of verdi status?

Hi @sphuber, this is the output:

 ✔ version:     AiiDA v2.6.2
 ✔ config:      /home/bidoggia/.aiida
 ✔ profile:     bidoggia
 ✔ storage:     Storage for 'bidoggia' [open] @ postgresql://aiida_qs_bidoggia_1a762e6ea970b7891456fe9e1265d632:***@localhost:5432/bidoggia_bidoggia_1a762e6ea970b7891456fe9e1265d632 / DiskObjectStoreRepository: caa1c1ad4ce44ccfb08f0ca4f0b3cdda | /home/bidoggia/.aiida/repository/bidoggia/container
 ✔ broker:      RabbitMQ v3.9.13 @ amqp://guest:guest@127.0.0.1:5672?heartbeat=600
 ✔ daemon:      Daemon is running with PID 1137937

Since you are using RabbitMQ v3.9.13, did you follow the instructions to configure the consumer_timeout?

This is important for AiiDA to function properly, although it should not really explain the behavior you are describing. What would be very useful is more information the next time a newly launched process gets stuck in created. In that case, please check the daemon log, which should be at ~/.aiida/daemon/log/aiida-{profile-name}.log, and search it for the pk of the stuck process. Also check the output of verdi process report and verdi node attributes for that process. Finally, stop the daemon and run verdi process repair --dry-run to confirm that the task is actually missing.
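If scanning the log by hand is tedious, the search for the pk can be scripted. The following is only a minimal sketch, not part of AiiDA: the function name find_pk_lines is made up for illustration, and it assumes the log is a plain text file in which the pk appears as a standalone number.

```python
import re
from pathlib import Path


def find_pk_lines(log_path, pk):
    """Return (line_number, text) pairs for log lines that mention ``pk``.

    Hypothetical helper, not an AiiDA function. It assumes a plain text log
    where the pk shows up as a whole number.
    """
    # \b on both sides keeps pk 326964 from matching inside 3269640 or 1326964.
    pattern = re.compile(rf"\b{int(pk)}\b")
    matches = []
    # expanduser() resolves a leading "~"; undecodable bytes are replaced, not fatal.
    with Path(log_path).expanduser().open(encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                matches.append((number, line.rstrip("\n")))
    return matches
```

Calling it with the path of the daemon log for your profile and the pk of the stuck process returns the matching lines together with their line numbers, which makes it easier to see what the daemon last did with that process.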

Also, maybe there is just a problem with your daemon. Please check the ~/.aiida/daemon/log/circus-{profile-name}.log. Maybe your daemon workers are getting killed often and are restarted?

Thank you @sphuber. Lately I have had even more difficulties. Calculations no longer get stuck in the created state; instead, they mostly get stuck in the running state, between one method and the next of those specified in the spec.outline of my workflow (a workflow that I have run several times without problems with both aiida-core 2.5.1 and 2.6.0). The previous method seems to finish properly, but the next one does not start. If I restart the daemon, the previous method is run again and then the process gets stuck once more.
Furthermore, I have also started encountering problems with exposing outputs (not in every run, but it has become more and more frequent the more I run). For example, a PwCalculation (it also happens with other plugins) finishes properly with all outputs, but PwBaseWorkChain complains:

2024-09-15 12:27:36 [199230 | REPORT]: [326964|PwBaseWorkChain|_attach_outputs]: required output `output_parameters` was not an output of PwCalculation<326973> (or an incorrect class/output is being exposed).

Looking at the log files you suggested I could not find anything interesting.

I thought my installation was somehow corrupted so, after a backup, I uninstalled aiida, postgresql and rabbitmq and installed them again. I tried both aiida-core 2.5.1 and 2.6.2 and I still have the same problem with exposed outputs. In the tests I have run so far, processes no longer get stuck in created or running.

This is the current output of verdi status:

 ✔ version:     AiiDA v2.6.2
 ✔ config:      /home/bidoggia/.aiida
 ✔ profile:     bidoggia
 ✔ storage:     Storage for 'bidoggia' [open] @ postgresql://bidoggia:***@localhost:5432/aiida_db_bidoggia2 / DiskObjectStoreRepository: caa1c1ad4ce44ccfb08f0ca4f0b3cdda | /media/bidoggia/aiida/repository/bidoggia/container
 ✔ broker:      RabbitMQ v3.13.7 @ amqp://guest:guest@127.0.0.1:5672?heartbeat=600
/home/bidoggia/py_envs/aiida/lib/python3.10/site-packages/paramiko/pkey.py:82: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
  "cipher": algorithms.TripleDES,
/home/bidoggia/py_envs/aiida/lib/python3.10/site-packages/paramiko/transport.py:253: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
  "class": algorithms.TripleDES,
 ✔ daemon:      Daemon is running with PID 10120

Thank you for your help!

Thanks @Davide_Bidoggia. Sorry to hear you are still experiencing issues. So to summarize, after your reinstall you:

  • no longer have the original problem with calculations that stall
  • now only experience problems with the BaseRestartWorkChain functionality

Correct?

Is it possible that the workchains that have problems with the attaching of outputs were launched with an older version of aiida-core (i.e. before you reinstalled everything) and continued after the clean install? Does it happen at all for any new workchains that you run now?

To try and diagnose further, could you share the verdi process report of a workchain that failed with that error?