Hello,
I am trying to run AiiDA in a container (Singularity) and configure it to use our remote HPC setup. The container setup completed successfully, but we are failing to establish an SSH connection to our remote HPC from the AiiDA container.
We observe the error message below:
ValueError: The SSH proxy jump and SSH proxy command options cannot be used together
Hi @Shraddha_Kiran!
It seems you specified both the proxy_command and proxy_jump settings, so AiiDA isn't sure which one to use. In general, I would recommend using proxy_jump.
Could you show me the output of
verdi computer configure show <COMPUTER_LABEL>
where you have to replace <COMPUTER_LABEL> with the label of the computer you are configuring SSH for.
Hi @mbercx
Thanks for your suggestion. I am pasting the output of
verdi computer configure show <COMPUTER_LABEL>
below:
(aiida) x0144578@dcalph078:~$ verdi computer configure show uhpc
* username x0144578
* port 22
* look_for_keys True
* key_filename /user/x0144578/rsm_ppk.ppk
* timeout 60
* allow_agent False
* proxy_jump n
* proxy_command
* compress True
* gss_auth False
* gss_kex False
* gss_deleg_creds False
* gss_host dcalph000
* load_system_host_keys True
* key_policy RejectPolicy
* use_login_shell True
* safe_interval 30.0
2023-11-14 05:10:30.170 PST [44129] LOG: unexpected EOF on client connection with an open transaction
(aiida) x0144578@dcalph078:~$ verdi computer test uhpc --print-traceback
Report: Testing computer<uhpc> for user<arunprasad_pandurangan@contractor.amat.com>...
* Opening connection... [FAILED]: Error while trying to connect to the computer
Full traceback:
Traceback (most recent call last):
File "/user/x0144578/.conda/envs/aiida/lib/python3.11/site-packages/aiida/cmdline/commands/cmd_computer.py", line 547, in computer_test
with transport:
File "/user/x0144578/.conda/envs/aiida/lib/python3.11/site-packages/aiida/transports/transport.py", line 128, in __enter__
self.open()
File "/user/x0144578/.conda/envs/aiida/lib/python3.11/site-packages/aiida/transports/plugins/ssh.py", line 459, in open
raise ValueError('The SSH proxy jump and SSH proxy command options can not be used together')
ValueError: The SSH proxy jump and SSH proxy command options can not be used together
Warning: 1 out of 0 tests failed
2023-11-14 05:11:00.892 PST [44180] LOG: unexpected EOF on client connection with an open transaction
Thanks @Shraddha_Kiran!
* proxy_jump n
* proxy_command
It seems you don't need to use a proxy to connect to your remote HPC? I think you may have accidentally set these when configuring the SSH transport. Can you try running
verdi computer configure core.ssh uhpc
and set no value for SSH proxy jump and SSH proxy command when prompted, using ! (an exclamation mark)? That is, keep the same values for the other settings by pressing enter, but set no value for the proxy settings with !, as explained in the report printed when executing the command:
Report: enter ! to ignore the default and set no value.
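To spot stray proxy values quickly, the output of verdi computer configure show can be scanned with a few lines of Python. This is only a sketch: it assumes the "* key value" line format shown in the paste above, and parse_configure_show and proxy_settings are hypothetical helper names, not part of AiiDA. An empty value in this listing does not tell us whether AiiDA treats the option as unset, so the script only reports what is visibly filled in.

```python
def parse_configure_show(output: str) -> dict:
    """Parse lines of the form '* key value' into a dict.

    Hypothetical helper: the line format is assumed from the paste in this
    thread. A key without a visible value maps to an empty string.
    """
    settings = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("*"):
            continue  # skip shell prompts and database log lines
        parts = line[1:].split(None, 1)
        if not parts:
            continue
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[key] = value
    return settings


def proxy_settings(settings: dict) -> dict:
    """Return the proxy-related options that show a non-empty value."""
    return {
        key: value
        for key, value in settings.items()
        if key in ("proxy_jump", "proxy_command") and value
    }


# Excerpt of the output posted above.
sample = """\
* username x0144578
* port 22
* proxy_jump n
* proxy_command
* compress True
"""

if __name__ == "__main__":
    print(proxy_settings(parse_configure_show(sample)))
```

Here the value n for proxy_jump stands out: it looks like an answer typed at an interactive prompt rather than a real jump host.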
The following line:
2023-11-14 05:10:30.170 PST [44129] LOG: unexpected EOF on client connection with an open transaction
is also somewhat worrying, but I think it is unrelated. This seems to be a warning from the database, but I'm not familiar with it. Maybe someone else has an idea?
Hi @mbercx
Is there any simpler way to check whether the AiiDA setup can indeed "talk" to the remote HPC?
Are you able to connect to the remote HPC from the container using just the ssh command, i.e. not through AiiDA? In principle you should then be able to configure the SSH transport for the AiiDA computer to connect to the remote as well, unless some special settings are required that I'm not familiar with. Can you show the configuration in the ~/.ssh/config file that you have set up?
Hello @mbercx
We tried ssh-ing to our remote HPC (dcalph000) from aiida image (aiida-core-with-services_edge.sif). Sharing the details below:
Apptainer> ssh dcalph000
Last login: Fri Nov 17 00:11:18 2023 from 10.141.1.78
Welcome to Bright release 9.0
Based on Red Hat Enterprise Linux Server 7
ID: #000002
Use the following commands to adjust your environment:
'module avail' - show available modules
'module add <module>' - adds a module to your environment for this session
'module initadd <module>' - configure module to be loaded at every login
-------------------------------------------------------------------------------
-bash-4.2$ logout
Connection to dcalph000 closed.
Apptainer> cat ~/.ssh/config
SendEnv ESI_HOME
Thanks @Shraddha_Kiran. It's good to see you can ssh into the remote at least, but it seems the configuration is not stored in ~/.ssh/config. Where are the hostname, user, etc. of dcalph000 configured?
Can you also show the output of
verdi computer show uhpc
I guess the SSH information is in ESI_HOME; the SendEnv option passes environment variables to the SSH session. If it contains no sensitive information, maybe @Shraddha_Kiran can show ESI_HOME by running echo $ESI_HOME?
We can then recommend which parameters need to be set for verdi computer configure.
One thing you can already try is to run verdi computer configure core.ssh uhpc again and, when asked for proxy_jump, give an empty string "". The proxy_jump setting is the source of the exception you mentioned.
Thanks for joining in @jusong.yu!
Hmm, I didn't know SendEnv could be used to provide connection configuration. I thought it would simply set the environment variables after connecting.
Are you sure about this? I thought you'd have to use ! to properly unset the configuration, since the proxy_jump and proxy_command variables are still considered to be set even if they are an empty string.
I guess you are correct; I am not sure.
Thank you @jusong.yu and @mbercx for your suggestions.
Here's the output of verdi computer show uhpc:
Apptainer> verdi computer show uhpc
Warning: You are currently using a post release development version of AiiDA: 2.4.0.post0
Warning: Be aware that this is not recommended for production and is not officially supported.
Warning: Databases used with this version may not be compatible with future releases of AiiDA
Warning: as you might not be able to automatically migrate your data.
--------------------------- ------------------------------------
Label uhpc
PK 1
UUID 06e9e96d-bf21-486d-a2d9-0146582b90b3
Description uhpc master node
Hostname dcalph000
Transport type core.ssh
Scheduler type core.slurm
Work directory /dat/usr/x0144578/
Shebang #!/bin/bash
Mpirun command mpirun -np {tot_num_mpiprocs}
Default #procs/machine 16
Default memory (kB)/machine 32
Prepend text
Append text
--------------------------- ------------------------------------
Apptainer> cat ~/.ssh/config
SendEnv ESI_HOME
Apptainer>
Apptainer> echo $ESI_HOME
I am also attaching the environment variables set in the container, just FYI:
Apptainer> env
BASH_FUNC_switchml()=() { typeset swfound=1;
if [ "${MODULES_USE_COMPAT_VERSION:-0}" = '1' ]; then
typeset swname='main';
if [ -e /cm/local/apps/environment-modules/4.4.0//libexec/modulecmd.tcl ]; then
typeset swfound=0;
unset MODULES_USE_COMPAT_VERSION;
fi;
else
typeset swname='compatibility';
if [ -e /cm/local/apps/environment-modules/4.4.0//libexec/modulecmd-compat ]; then
typeset swfound=0;
MODULES_USE_COMPAT_VERSION=1;
export MODULES_USE_COMPAT_VERSION;
fi;
fi;
if [ $swfound -eq 0 ]; then
echo "Switching to Modules $swname version";
source /cm/local/apps/environment-modules/4.4.0//init/bash;
else
echo "Cannot switch to Modules $swname version, command not found";
return 1;
fi
}
BASH_FUNC_module()=() { _module_raw "$@" 2>&1
}
BASH_FUNC__module_raw()=() { unset _mlshdbg;
if [ "${MODULES_SILENT_SHELL_DEBUG:-0}" = '1' ]; then
case "$-" in
*v*x*)
set +vx;
_mlshdbg='vx'
;;
*v*)
set +v;
_mlshdbg='v'
;;
*x*)
set +x;
_mlshdbg='x'
;;
*)
_mlshdbg=''
;;
esac;
fi;
unset _mlre _mlIFS;
if [ -n "${IFS+x}" ]; then
_mlIFS=$IFS;
fi;
IFS=' ';
for _mlv in ${MODULES_RUN_QUARANTINE:-};
do
if [ "${_mlv}" = "${_mlv##*[!A-Za-z0-9_]}" -a "${_mlv}" = "${_mlv#[0-9]}" ]; then
if [ -n "`eval 'echo ${'$_mlv'+x}'`" ]; then
_mlre="${_mlre:-}${_mlv}_modquar='`eval 'echo ${'$_mlv'}'`' ";
fi;
_mlrv="MODULES_RUNENV_${_mlv}";
_mlre="${_mlre:-}${_mlv}='`eval 'echo ${'$_mlrv':-}'`' ";
fi;
done;
if [ -n "${_mlre:-}" ]; then
eval `eval ${_mlre}/usr/bin/tclsh /cm/local/apps/environment-modules/4.4.0/libexec/modulecmd.tcl bash '"$@"'`;
else
eval `/usr/bin/tclsh /cm/local/apps/environment-modules/4.4.0/libexec/modulecmd.tcl bash "$@"`;
fi;
_mlstatus=$?;
if [ -n "${_mlIFS+x}" ]; then
IFS=$_mlIFS;
else
unset IFS;
fi;
unset _mlre _mlv _mlrv _mlIFS;
if [ -n "${_mlshdbg:-}" ]; then
set -$_mlshdbg;
fi;
unset _mlshdbg;
return $_mlstatus
}
SHELL=/bin/bash
HISTCONTROL=ignoredups
SLURM_TIME_FORMAT=%b %e %k:%M
HOSTNAME=dcalph078
HISTSIZE=1000
S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0
LANGUAGE=en_US.UTF-8
SINGULARITY_NAME=aiida-core-with-services_edge.sif
QT_GRAPHICSSYSTEM_CHECKED=1
PGSQL_VERSION=15
_LMFILES__modshare=/cm/shared/modulefiles/slurm/19.05.7:1
LIBRARY_PATH_modshare=/cm/shared/apps/slurm/19.05.7/lib64/slurm:1:/cm/shared/apps/slurm/19.05.7/lib64:1
CPATH_modshare=/cm/shared/apps/slurm/19.05.7/include:1
SINGULARITY_ENVIRONMENT=/.singularity.d/env/91-environment.sh
MANPATH_modshare=/usr/local/share/man:1:/usr/share/man/overrides:1:/cm/local/apps/environment-modules/4.4.0//share/man:1:/cm/local/apps/environment-modules/current/share/man:1:/cm/shared/apps/slurm/19.05.7/man:1:/usr/share/man:1
ENV=/user/x0144578/.kshrc
PWD=/user/x0144578
LOGNAME=x0144578
MODULESHOME=/cm/local/apps/environment-modules/4.4.0/
MANPATH=/cm/shared/apps/slurm/19.05.7/man:/cm/local/apps/environment-modules/4.4.0//share/man:/usr/local/share/man:/usr/share/man/overrides:/usr/share/man:/cm/local/apps/environment-modules/current/share/man
SYSTEM_USER=aiida
USER_PATH=/cm/shared/apps/slurm/19.05.7/sbin:/cm/shared/apps/slurm/19.05.7/bin:/cm/local/apps/environment-modules/4.4.0//bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/sbin:/cm/local/apps/environment-modules/4.4.0/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
APPTAINER_ENVIRONMENT=/.singularity.d/env/91-environment.sh
APPTAINER_APPNAME=
HOME=/user/x0144578
LANG=en_US.UTF-8
SINFO_FORMAT=%n %.10T %.5a %.8e %.7m %.4c %.10G %.8O %C %f
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
LD_LIBRARY_PATH_modshare=/cm/shared/apps/slurm/19.05.7/lib64/slurm:1:/cm/shared/apps/slurm/19.05.7/lib64:1
APPTAINER_COMMAND=shell
SINGULARITY_CONTAINER=/dat/usr/x0144578/singularity/aiida-core-with-services_edge.sif
SSH_CONNECTION=10.141.1.200 39192 10.141.1.78 22
SQUEUE_PARTITION=test,interact,license,lic_low,normal,low,open,gpu,gpu_open,short,high
PATH_modshare=/cm/local/apps/environment-modules/4.4.0//bin:1:/usr/sbin:1:/usr/bin:1:/cm/local/apps/environment-modules/4.4.0/bin:1:/cm/shared/apps/slurm/19.05.7/sbin:1:/usr/local/sbin:1:/cm/shared/apps/slurm/19.05.7/bin:1:/usr/local/bin:1:/sbin:1
RMQ_VERSION=3.10.18
APPTAINER_CONTAINER=/dat/usr/x0144578/singularity/aiida-core-with-services_edge.sif
LOADEDMODULES_modshare=slurm/19.05.7:1
TERM=xterm
LESSOPEN=||/usr/bin/lesspipe.sh %s
USER=x0144578
LIBRARY_PATH=/cm/shared/apps/slurm/19.05.7/lib64/slurm:/cm/shared/apps/slurm/19.05.7/lib64
LOADEDMODULES=slurm/19.05.7
SHLVL=2
BASH_ENV=/cm/local/apps/environment-modules/4.4.0//init/bash
CONDA_DIR=/opt/conda
CVS_RSH=ssh
APPTAINER_NAME=aiida-core-with-services_edge.sif
SINGULARITY_BIND=
XDG_SESSION_ID=134554
APPTAINER_BIND=
LD_LIBRARY_PATH=/.singularity.d/libs
XDG_RUNTIME_DIR=/run/user/32752
PS1=Apptainer>
SSH_CLIENT=10.141.1.200 39192 22
SYSTEM_UID=1000
ENABLE_LMOD=0
SQUEUE_SORT=U,P,N
LC_ALL=en_US.UTF-8
XDG_DATA_DIRS=/user/x0144578/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share
PATH=/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home//.local/bin
MODULEPATH=/cm/local/modulefiles:/cm/shared/modulefiles
_LMFILES_=/cm/shared/modulefiles/slurm/19.05.7
MAIL=/var/spool/mail/x0144578
SSH_TTY=/dev/pts/3
SQUEUE_FORMAT2=jobid:8,username:9,statecompact:3,partition:13,name:20,command:20,submittime:13,numcpus:5,gres:12,feature:25,numnodes:6,reasonlist:90
SYSTEM_GID=100
CPATH=/cm/shared/apps/slurm/19.05.7/include
DEBIAN_FRONTEND=noninteractive
MODULES_CMD=/cm/local/apps/environment-modules/4.4.0/libexec/modulecmd.tcl
_=/usr/bin/env
Thanks @Shraddha_Kiran,
I still don't really know how your ssh to dcalph000 is configured, to be honest. I'm no sysadmin, however, so it's probably some approach I'm not familiar with.
In any case, let's just try setting up a new computer, but this time without specifying the proxy settings. Create two files with the following names and contents:
uhpc-test.yaml:
label: uhpc-test
description: uhpc master node
hostname: dcalph000
transport: core.ssh
scheduler: core.slurm
shebang: '#!/bin/bash'
work_dir: /dat/usr/x0144578/
mpirun_command: mpirun -np {tot_num_mpiprocs}
mpiprocs_per_machine: 16
prepend_text: ' '
append_text: ' '
uhpc-test-configure.yaml:
username: x0144578
key_filename: /user/x0144578/rsm_ppk.ppk
safe_interval: 10.0
(Also maybe check that I haven't made any typos here.)
Then first set up the new computer:
verdi computer setup -n --config uhpc-test.yaml
And subsequently configure the core.ssh transport for it:
verdi computer configure core.ssh uhpc-test -n --config uhpc-test-configure.yaml
Then test the computer, and pray:
verdi computer test uhpc-test
Or at least report back in case there are any issues. ^^
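If you want to repeat these three steps without retyping them, they can be scripted. The following is a minimal sketch in Python, assuming verdi is on the PATH of the active environment and that the two YAML files sit in the current directory. verdi_commands and run_all are hypothetical helper names, and the default is a dry run that only prints the commands.

```python
import subprocess


def verdi_commands(label: str, setup_file: str, configure_file: str) -> list:
    """Build the three verdi invocations from this post as argument lists."""
    return [
        ["verdi", "computer", "setup", "-n", "--config", setup_file],
        ["verdi", "computer", "configure", "core.ssh", label,
         "-n", "--config", configure_file],
        ["verdi", "computer", "test", label],
    ]


def run_all(commands: list, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(verdi_commands("uhpc-test", "uhpc-test.yaml",
                           "uhpc-test-configure.yaml"))
```

Calling run_all(..., dry_run=False) would actually execute the commands, stopping as soon as one of them exits with an error.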
Hi Marnik,
After following the given steps, we are getting an error about an invalid configuration file.
(aiida) x0144578@dcalph078:~$ vi uhpc-test.yaml
(aiida) x0144578@dcalph078:~$ cat uhpc-test.yaml
username: x0144578
key_filename: /user/x0144578/rsm_ppk.ppk
safe_interval: 10.0
(aiida) x0144578@dcalph078:~$
(aiida) x0144578@dcalph078:~$ verdi computer setup -n --config uhpc-test.yaml
Usage: verdi computer setup [OPTIONS]
Try 'verdi computer setup --help' for help.
Error: Invalid value for '--config': Invalid configuration file, the following keys are not supported: {'safe_interval', 'username', 'key_filename'}
(aiida) x0144578@dcalph078:~$
Thanks
Arun parasd P
Dear @Arun,
Apologies, I was a bit too quick. I mixed up the "setup" file contents (uhpc-test.yaml) with the "SSH configure" one (uhpc-test-configure.yaml). I've corrected my post above, can you give it another try?
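To catch this kind of mix-up before running verdi, the keys of each file can be compared against the keys that belong to each step. The sketch below is only illustrative: the two key sets are collected from this thread (the setup file and the verdi computer configure show output) and are not an authoritative list of what AiiDA accepts, and misplaced_keys is a hypothetical helper name.

```python
# Keys seen in the "setup" file in this thread (assumption: not exhaustive).
SETUP_KEYS = {
    "label", "description", "hostname", "transport", "scheduler",
    "shebang", "work_dir", "mpirun_command", "mpiprocs_per_machine",
    "prepend_text", "append_text",
}

# Keys seen in the core.ssh configuration output in this thread
# (assumption: not exhaustive).
CONFIGURE_KEYS = {
    "username", "port", "look_for_keys", "key_filename", "timeout",
    "allow_agent", "proxy_jump", "proxy_command", "compress", "gss_auth",
    "gss_kex", "gss_deleg_creds", "gss_host", "load_system_host_keys",
    "key_policy", "use_login_shell", "safe_interval",
}


def misplaced_keys(config: dict, allowed: set) -> set:
    """Return the keys of config that are not in the allowed set."""
    return set(config) - allowed


if __name__ == "__main__":
    # The content that was accidentally saved as uhpc-test.yaml.
    wrong_setup = {
        "username": "x0144578",
        "key_filename": "/user/x0144578/rsm_ppk.ppk",
        "safe_interval": 10.0,
    }
    print(sorted(misplaced_keys(wrong_setup, SETUP_KEYS)))
```

For the mixed-up file this reports the same three keys that verdi computer setup rejected.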
Best,
Marnik
Hello @mbercx
Are we sure that, after we configure AiiDA to use the remote HPC, it uses an SSH connection in the backend to connect to it?
The computer has the core.ssh transport configured, which uses paramiko to connect to the remote. paramiko is a Python implementation of the SSH protocol.
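If you want to check the connection with paramiko directly, outside of AiiDA, something like the following sketch could be used. It assumes paramiko is installed in the environment and mirrors a few of the values from the configuration shown earlier in this thread (port 22, timeout 60, RejectPolicy, system host keys). It is not what AiiDA runs internally, and build_connect_kwargs and can_connect are hypothetical helper names.

```python
def build_connect_kwargs(hostname: str, username: str, key_filename: str,
                         port: int = 22, timeout: int = 60) -> dict:
    """Collect keyword arguments for paramiko.SSHClient.connect."""
    return {
        "hostname": hostname,
        "port": port,
        "username": username,
        "key_filename": key_filename,
        "timeout": timeout,
        "allow_agent": False,
        "look_for_keys": True,
        "compress": True,
    }


def can_connect(kwargs: dict) -> bool:
    """Try to open an SSH session and run 'hostname' on the remote."""
    import paramiko  # third-party; imported here so the helper above works without it

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    # Refuse hosts that are not in the known-hosts files.
    client.set_missing_host_key_policy(paramiko.RejectPolicy())
    try:
        client.connect(**kwargs)
        _, stdout, _ = client.exec_command("hostname")
        print(stdout.read().decode().strip())
        return True
    except Exception as exc:  # broad on purpose: this is a diagnostic script
        print(f"connection failed: {exc}")
        return False
    finally:
        client.close()


if __name__ == "__main__":
    kwargs = build_connect_kwargs("dcalph000", "x0144578",
                                  "/user/x0144578/rsm_ppk.ppk")
    print(can_connect(kwargs))
```

If this fails while the plain ssh command works, the difference is likely in settings that the ssh client picks up from elsewhere and that paramiko does not see.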
Hello @mbercx
Your suggestion worked and we were able to ssh to our remote HPC successfully. Thank you again!
Now that we can access our HPC environment, what is the simplest way to test aiida-quantumespresso?
Note: We already have quantumespresso ( qe-6.8 ) available on our remote HPC
Regards
Shraddha
The best approach would be to follow the "Get started" instructions in the aiida-quantumespresso documentation.
Since you have already configured the computer, you can skip that step and go straight to configuring the code. Once you have finished that and installed the pseudopotentials, you can test running a pw.x calculation with
aiida-quantumespresso calculation launch pw -X <CODE> -F SSSP/1.2/PBE/efficiency
replacing <CODE> with the label of the code you set up.