Carme-Docu

Documentation Project for Carme

Install NOTES

System requirements

In order to install Carme, you should have the following basic setup

Hardware

Software

Using existing HPC Setups

If you are already running an HPC system with a parallel FS and Slurm, you can simply add Carme on top of this setup. We recommend creating new Slurm partitions for Carme.

Things Carme can’t do (yet)

Install Components needed by Carme

In order to run and use Carme you have to install and configure some components. Each folder contains either scripts that do most of the basic installation for you or READMEs that provide instructions and additional information.

Before you continue, make sure that the entire Carme repository is located in /opt on your headnode

# git clone https://github.com/CarmeTeam/Carme.git

Besides this, make sure that the following folders are mounted

First Step (basic setup)

Before you run any of the scripts, make sure that you have edited CarmeConfig according to your needs (see the CarmeConfig docs). This is essential, as each script here (and later in production mode) depends on CarmeConfig.

In order to create CarmeConfig you can copy the CarmeConfig_blanco to a new CarmeConfig

# cp /opt/Carme/CarmeConfig_blanco /opt/Carme/CarmeConfig

Then you edit CarmeConfig using e.g. vim (or nano)

# vim /opt/Carme/CarmeConfig

Note that the file contains example and/or default values for each variable that is defined in CarmeConfig.

After everything is configured you should make the file only accessible to root

# chmod 600 /opt/Carme/CarmeConfig
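
To confirm that the permissions took effect, a small check can be run afterwards. This is only a minimal sketch: check_mode is a hypothetical helper name that is not part of Carme, and it assumes GNU stat (the -c option), as found on typical Linux head nodes.

```shell
# Hypothetical helper: report whether a file has exactly the expected
# octal mode. Assumes GNU stat (stat -c '%a').
check_mode() {
    cm_file=$1
    cm_expected=$2
    cm_actual=$(stat -c '%a' "$cm_file") || return 2
    if [ "$cm_actual" = "$cm_expected" ]; then
        echo "OK: $cm_file has mode $cm_actual"
    else
        echo "WARN: $cm_file has mode $cm_actual, expected $cm_expected" >&2
        return 1
    fi
}

# Example (run as root on the head node):
# check_mode /opt/Carme/CarmeConfig 600
```

The function returns a non-zero status on a mismatch, so it can also be used as a guard at the top of other scripts.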

Second Step (basic installations)

The next steps depend on your needs. If you start from scratch, you will find here scripts and additional information to install

For LDAP, SLURM and Singularity we provide scripts that do the basic installation and configuration for you, so that you get a good starting point. For Zabbix and Mattermost we recommend following their own installation and configuration instructions (links can be found in the respective folders).

In addition you should at this point install the relevant parts for

After these installation steps you have to prepare everything for the logging

Third Step (basic initialization)

Cronjobs

There are a few scripts that should be added to your crontab. For more information have a look at the respective documentation.

headnode
compute nodes

What you should have so far

In order to proceed we assume that the following things are available

optional (if not needed skip in the following steps that include one of these)

Configure and Start Carme

Generate SSL Certs

Carme uses SSL certs to authenticate and encrypt communication between the jobs, the frontend and the backend. Scripts to generate these certs and keys are located in /opt/Carme/Carme-Backend/SSL/. To create the backend key and cert, execute the following

# cd /opt/Carme/Carme-Backend/SSL/
# openssl genrsa  -out backend.key 4096
# openssl req -new -x509 -days 3650 -key backend.key -out backend.crt
# chmod 600 backend.key
# chmod 600 backend.crt

NOTE: the backend key and cert have to be in /opt/Carme/Carme-Backend/SSL/ and must be readable only by root
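
As a quick sanity check, the generated certificate can be inspected before it is used to sign the other certs. This is a minimal sketch: cert_summary is a hypothetical helper name, and it only wraps standard openssl x509 options.

```shell
# Hypothetical helper: print subject, issuer and expiry date of a cert.
cert_summary() {
    openssl x509 -in "$1" -noout -subject -issuer -enddate
}

# Example (run as root on the head node):
# cert_summary /opt/Carme/Carme-Backend/SSL/backend.crt
```

Because the backend cert is self-signed, subject and issuer should be identical in its output.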

create frontend key and cert

Next we have to create a key and cert for the frontend. Note that you have to adjust the following commands according to the values you have defined in the CarmeConfig!

# cd /opt/Carme/Carme-Backend/SSL/
# openssl genrsa -out frontend.key 4096
# openssl req -new -key frontend.key -out frontend.csr -subj "/C=CARME_SSL_C/ST=CARME_SSL_ST/L=CARME_SSL_L/O=CARME_SSL_O/OU=CARME_SSL_OU/CN=CLUSTER_USER/emailAddress=frontend@CARME_SSL_EMAIL_BASE" -passin pass:""
# openssl x509 -req -days 3652 -in frontend.csr -CA backend.crt -CAkey backend.key -set_serial 01 -out frontend.crt
# rm frontend.csr
# chown www-data:www-data frontend.key
# chown www-data:www-data frontend.crt
# mkdir -p /opt/Carme/Carme-Frontend/Carme-Django/webfrontend/SSL
# mv frontend.key /opt/Carme/Carme-Frontend/Carme-Django/webfrontend/SSL/frontend.key
# mv frontend.crt /opt/Carme/Carme-Frontend/Carme-Django/webfrontend/SSL/frontend.crt
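
The -subj string above is assembled from the CARME_SSL_* placeholders. The sketch below shows one way to build it, assuming those placeholders are available as shell variables of the same name (for instance set by hand from the values in CarmeConfig; whether CarmeConfig can be sourced directly is an assumption). carme_subject is a hypothetical helper name.

```shell
# Hypothetical helper: build the -subj string used in the openssl req calls.
#   $1 = role used as the local part of the e-mail address (e.g. frontend)
#   $2 = common name (the CLUSTER_USER placeholder)
# Assumes CARME_SSL_C, CARME_SSL_ST, CARME_SSL_L, CARME_SSL_O,
# CARME_SSL_OU and CARME_SSL_EMAIL_BASE are set as shell variables.
carme_subject() {
    printf '/C=%s/ST=%s/L=%s/O=%s/OU=%s/CN=%s/emailAddress=%s@%s\n' \
        "$CARME_SSL_C" "$CARME_SSL_ST" "$CARME_SSL_L" "$CARME_SSL_O" \
        "$CARME_SSL_OU" "$2" "$1" "$CARME_SSL_EMAIL_BASE"
}

# Example:
# openssl req -new -key frontend.key -out frontend.csr \
#     -subj "$(carme_subject frontend "$CLUSTER_USER")"
```

Building the string in one place keeps the frontend and slurmctld requests consistent with each other.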

create slurmctld key and cert

Next we have to create a key and cert for the callbacks of the slurmctld. Note that you have to adjust the following commands according to the values you have defined in the CarmeConfig!

# cd /opt/Carme/Carme-Backend/SSL/
# openssl genrsa -out slurmctld.key 4096
# openssl req -new -key slurmctld.key -out slurmctld.csr -subj "/C=CARME_SSL_C/ST=CARME_SSL_ST/L=CARME_SSL_L/O=CARME_SSL_O/OU=CARME_SSL_OU/CN=CLUSTER_USER/emailAddress=slurmctld@CARME_SSL_EMAIL_BASE" -passin pass:""
# openssl x509 -req -days 3652 -in slurmctld.csr -CA backend.crt -CAkey backend.key -set_serial 01 -out slurmctld.crt
# rm slurmctld.csr
# chown slurm:slurm slurmctld.key
# chown slurm:slurm slurmctld.crt
# chmod 600 slurmctld.key
# chmod 600 slurmctld.crt
# mv slurmctld.key /opt/Carme/Carme-Scripts/backend/slurmctld.key
# mv slurmctld.crt /opt/Carme/Carme-Scripts/backend/slurmctld.crt
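
Since the frontend and slurmctld certs are both signed with the backend key, it can be worth checking that they validate against backend.crt. This is a minimal sketch: verify_against_ca is a hypothetical helper name, and it relies on openssl verify returning a non-zero exit status on failure, which holds for reasonably recent OpenSSL versions.

```shell
# Hypothetical helper: verify one or more certs against a CA cert.
#   $1 = CA certificate, remaining arguments = certs to check
verify_against_ca() {
    vac_ca=$1
    shift
    for vac_cert in "$@"; do
        openssl verify -CAfile "$vac_ca" "$vac_cert" || return 1
    done
}

# Example with the paths used in the steps above (run as root):
# verify_against_ca /opt/Carme/Carme-Backend/SSL/backend.crt \
#     /opt/Carme/Carme-Frontend/Carme-Django/webfrontend/SSL/frontend.crt \
#     /opt/Carme/Carme-Scripts/backend/slurmctld.crt
```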

create user certs

Next we have to create certs for the users as well. Note that this has to be done every time you add a new user to the system!

The creation can be done with the script createAndDeployUserCarts.sh located in /opt/Carme/Carme-Backend/SSL/.

Start the Carme-Backend

For testing, start the backend-server on the head-node like this:

# /usr/bin/python3 /opt/Carme/Carme-Backend/Python/carme_backend.py

Depending on the debug level set in CarmeConfig, the backend server will produce additional log output on all connections.

To run the backend permanently, add it to the system init, e.g. as /etc/systemd/system/carme-backend.service

[Unit]
Description=CarmeBackend
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/bin/python3 /opt/Carme/Carme-Backend/Python/carme_backend.py
Restart=on-failure
RestartSec=1
StartLimitAction=none

[Install]
WantedBy=multi-user.target
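
Once the unit file is in place, systemd has to reload its configuration before the service can be enabled and started. The sketch below wraps the usual systemctl calls in a dry-run switch so they can be previewed first; run and enable_carme_backend are hypothetical helper names, and the unit name is taken from the file path mentioned above.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Reload unit files, enable the service at boot, start it now,
# and show its current state.
enable_carme_backend() {
    run systemctl daemon-reload
    run systemctl enable --now carme-backend.service
    run systemctl --no-pager status carme-backend.service
}

# Preview:  DRY_RUN=1; enable_carme_backend
# Execute:  enable_carme_backend        (as root)
```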

Start Proxy and Webfrontend

Before you continue, make sure that the Singularity images of the proxy and webfrontend are created (see install instructions) and copied to the login node, e.g. to

Note that the scripts

have to be copied to the respective locations as well!

Then we can start them (on the login node!) with

# cd /opt/Carme-Proxy-Container
# bash run-carme-proxy.sh start

and

# cd /opt/Carme-Frontend-Container
# bash run-carme-frontend.sh start

This will start the respective Singularity containers as daemons (called instances by Singularity).
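
Whether an instance is actually running can be checked with singularity instance list. This is a minimal sketch: instance_listed is a hypothetical helper that reads that command's output on stdin and assumes the instance name is the first column after a single header row. The name CarmeFrontend is the one used in the migration commands in this document.

```shell
# Hypothetical helper: succeed if the named instance appears in the
# output of `singularity instance list` (read from stdin).
# Assumes one header row and the instance name in the first column.
instance_listed() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Example (on the login node):
# singularity instance list | instance_listed CarmeFrontend \
#     && echo "CarmeFrontend is running"
```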

Migrating the database

The first time we start Carme, and then after every update of Carme, we need to migrate the SQL databases used by the web-frontend. To do this we have to enter the frontend image and execute the following commands

# singularity shell instance://CarmeFrontend
# cd /opt/Carme/Carme-Frontend/Carme-Django/webfrontend
# python manage.py makemigrations
# python manage.py migrate
# exit

This should run through without error messages, populating the MySQL DB.

Migrating Static Files

The static files are collected from inside the frontend image as well

# singularity shell instance://CarmeFrontend
# cd /opt/Carme/Carme-Frontend/Carme-Django/webfrontend
# python manage.py collectstatic
# exit
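
The interactive sessions above can also be run non-interactively, which is convenient after updates. This is a minimal sketch, assuming that singularity exec can target the running instance and that sh is available inside the image. frontend_manage is a hypothetical helper name. The instance name and directory are the ones used in the commands above.

```shell
FRONTEND_INSTANCE=instance://CarmeFrontend
FRONTEND_DIR=/opt/Carme/Carme-Frontend/Carme-Django/webfrontend

# Hypothetical helper: run one manage.py subcommand inside the running
# frontend instance. All arguments are passed on to manage.py.
frontend_manage() {
    singularity exec "$FRONTEND_INSTANCE" \
        sh -c 'cd "$1" && shift && python manage.py "$@"' sh "$FRONTEND_DIR" "$@"
}

# Example (on the login node):
# frontend_manage makemigrations && frontend_manage migrate
# frontend_manage collectstatic --noinput
```

The --noinput option of collectstatic suppresses the confirmation prompt. Leave it out to keep the interactive behaviour shown above.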

Last Steps

Carme should be working now. The next steps to start working on the cluster are: