Is AiiDA 2.6 slower than 2.5, both using PostgreSQL?

So the title basically says it all.

I am now moving to AiiDA 2.6 after using 2.5 for a while. I installed it using PostgreSQL to keep the setup as similar to 2.5 as possible.

While running several workchains simultaneously, I have noticed that the actual calculation takes a long time to start, on the order of minutes. That is, it takes a while to go from the "running" status to the actual start of the calculation (I was checking all the processes using top).

Is this normal? Is AiiDA 2.6 slower than previous versions? Or is it just my perception, or maybe some issue with my computer?

Thanks in advance

Jaime

Hi Jaime, in principle there should be no difference; at least nothing immediately springs to mind. The new option to run a profile without PostgreSQL is purely additive and should not affect PSQL profiles whatsoever.

Could you provide a bit more information about what kind of calculations you are running? These are CalcJobs, I take it? Are you submitting them to your localhost or to some remote computer? Is there a scheduler involved, or just the core.direct scheduler? What do you mean by "running" and "actual start of the calculation"? What are the exact statuses shown by verdi process list?
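For reference, the exact state strings can be read off with verdi process list (the -a/--all and -p/--past-days flags used below are standard options):

verdi process list -a -p1   # all processes created in the last day, with their states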


Hi Sebastiaan. Thanks for your answer, and sorry for the lack of detail in my post. Now that I read it again, I can tell that it is not as precise as it should be.

Yeah, sure. I am running the DFT package SIESTA with the aiida-siesta plugin available in the plugin registry.

My localhost

Just the core.direct scheduler.

Sorry, I should have been more clear.
What I mean is that, for example, I feel that the workchain remains in the Created state for a long time, maybe some tens of seconds. I stress that it is just my impression; I have not made any rigorous quantitative comparison, but in my mind it was faster on the computer I have with AiiDA 2.5.0. I also noticed that the workchain entered the Waiting state ("Monitoring scheduler: job state RUNNING"), but I was simultaneously watching the top screen and did not see any siesta process for maybe 20 seconds or so.

I do not know if I should be worried: everything works fine, all workchains end up being submitted and run correctly, and the wait until the actual siesta process starts is not a big deal. But, not gonna lie, I am a bit paranoid and get worried when I see behaviour that feels unusual. In particular, I am afraid of running into issues with the daemon not being able to track the actual status of the process, or something similar.

I want to stress again that all of this is more of a feeling than a rigorous observation, but on my AiiDA 2.5.0 machine, workchains enter the Running state almost instantaneously.

Thanks for the additional detail. Could you please report the output of verdi status?

✔ version:     AiiDA v2.6.1
✔ config:      /home/ICN2/jgarridoa/.aiida
✔ profile:     presto
✔ storage:     Storage for 'presto' [open] @ postgresql://aiida-presto_2:***@localhost:5432/aiida-presto / DiskObjectStoreRepository: c613722add2447e9ab6163be1da50d76 | /home/ICN2/jgarridoa/.aiida/repository/presto/container
✔ broker:      RabbitMQ v3.12.13 @ amqp://guest:guest@127.0.0.1:5672?heartbeat=600
✔ daemon:      Daemon is running with PID 1738209

Thanks. My suspicion was that you might have configured the profile to run without RabbitMQ, which would have explained the slowness: in that case AiiDA has to fall back on a polling-based mechanism to move processes forward. But that is clearly not the case.
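As a side note, whether a profile actually has a broker can also be checked from a verdi shell. A minimal sketch, assuming the manager's get_broker() accessor (it should return None for a broker-less profile):

from aiida.manage import get_manager

# Returns a broker instance when RabbitMQ is configured, None otherwise.
print(get_manager().get_broker())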

Then there are two more settings to check. First, please look at the output of

verdi computer configure show localhost

assuming that you named your localhost computer localhost. Otherwise, replace that with the actual label of the Computer you are using. It should print a safe_interval value, which should be pretty small.

And then open a verdi shell and print the output of

load_computer('localhost').get_minimum_job_poll_interval()

again replacing localhost if your Computer has a different label.
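For completeness, both values could also be inspected, and lowered if they turned out to be large, from that same verdi shell. A sketch, assuming set_minimum_job_poll_interval() as the setter counterpart of the getter above:

computer = load_computer('localhost')            # replace with your Computer label
print(computer.get_minimum_job_poll_interval())  # seconds between scheduler polls
computer.set_minimum_job_poll_interval(1.0)      # only needed if the value were large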

* use_login_shell  -
* safe_interval    0

It is exactly zero in my case.

My output is 1:

In [1]: load_computer('localhost').get_minimum_job_poll_interval()
Out[1]: 1

Hmm, those look fine and so cannot explain the slowdown either. I am not sure what else to check, other than perhaps increasing the log level with verdi config set logging.aiida_loglevel info, stopping the daemon, and then running verdi daemon worker. This launches a single daemon worker in your shell in a blocking way. If you then submit one of your workchains from another terminal, it will start running in that blocking shell and you should get output with more verbose log messages. If you could share that, we might be able to see exactly where most of the time is spent.
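Spelled out step by step, the suggested procedure is:

verdi config set logging.aiida_loglevel info
verdi daemon stop
verdi daemon worker   # runs a single worker in the foreground, blocking
# then, from a second terminal, submit a workchain and watch the log output here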

Before trying what you suggest, let me ask a question.

Previously, I said:

What I mean is that, for example, I feel that the workchain remains in the Created state for a long time, maybe some tens of seconds. I stress that it is just my impression; I have not made any rigorous quantitative comparison, but in my mind it was faster on the computer I have with AiiDA 2.5.0. I also noticed that the workchain entered the Waiting state ("Monitoring scheduler: job state RUNNING"), but I was simultaneously watching the top screen and did not see any siesta process for maybe 20 seconds or so.

Is this behaviour normal if a lot of workchains (60) have been submitted in a matter of seconds?

Just for testing purposes, I submitted a single workchain and observed what I consider the usual behaviour: I saw the siesta process almost immediately after its corresponding workchain entered the Waiting state ("Monitoring scheduler: job state RUNNING"). So in this case, everything goes nicely.

Nevertheless, the case I described initially was a situation in which I was submitting 60 workchains through a Python script. Is that number of workchains relevant for this topic?

I apologize for omitting this detail before. I thought it was not relevant because I have done the same thing in the past without the issue I reported here, but it is worth mentioning that I never submitted as many as 60; I usually submitted 12 at most.
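For concreteness, the bulk-submission script in question had roughly the following shape (a schematic sketch, not my actual script; the siesta.base entry point and the builder setup are assumptions based on aiida-siesta):

import time
from aiida import load_profile
from aiida.engine import submit
from aiida.plugins import WorkflowFactory

load_profile()
SiestaBaseWorkChain = WorkflowFactory('siesta.base')  # entry point assumed from aiida-siesta

for structure in structures:  # `structures` stands in for however the 60 inputs are generated
    builder = SiestaBaseWorkChain.get_builder()
    builder.structure = structure
    # ... code, parameters, resources, etc. ...
    node = submit(builder)
    print(f'Submitted SiestaBaseWorkChain<{node.pk}>')
    time.sleep(2)  # optional: stagger submissions so the daemon workers keep up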

If you are submitting 60 workchains, what is probably happening is that the daemon workers (are you running one or multiple?) first start running all of the workchains and their initial steps before they get to their subprocesses, the Siesta calcjobs that you are monitoring. But I take it that at some point all those siesta jobs start running quite quickly, one after the other.
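As an aside, the number of daemon workers can be checked, and increased if throughput ever becomes a bottleneck (these are standard verdi daemon subcommands, not something required here):

verdi daemon status   # lists the running workers and their load
verdi daemon incr 2   # adds two more workers to the pool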

I was using two daemon workers.

Okay, if that is the expected behaviour, then what I reported as an issue is not one. My bad for assuming that the expected behaviour was to run the first steps and initialize the subprocesses of each workchain before proceeding with the rest.

With these clarifications, I think my topic is now solved.

Do you want me still to check something in particular?

If not, you can consider this topic as closed.
