Carme-Docu

Documentation Project for Carme

Installation documentation

If you have a suggestion or a question that is not resolved in this documentation, please contact the Carme Team:

carme@itwm.fraunhofer.de

Note: The documentation provided here allows you to install Carme-demo. We do not recommend this installation in production-mode.

This documentation is divided into the following sections:

Introduction

Basic options

Advanced options

What is Carme-demo

Carme-demo is a simplified version of Carme. It excludes advanced features that are relevant in production mode.

Carme-demo is easy to install. You can test it on Debian-based and RedHat-based systems (including WSL). Give it a try and enjoy.

In detail:

| Features | Carme-demo | Carme |
|---|---|---|
| LDAP | Set in Debian / Not set in RedHat | Required |
| Authentication | Login + 2FA | Login + 2FA |
| Multi-users | Set | Set |
| GPUs/CPUs | Set | Set |
| TLS | Not set | Set |
| Projects App | Not set | Set |
| Management Scripts | Not set | Set |
| IDEs/Tools | JupyterLab and Code-Server | JupyterLab, Code-Server, GPI, and more |
| Cluster | Supports 1 head-node and >1 compute-nodes | Supports a login-node, a head-node, backup-nodes, and compute-nodes |

System requirements

For an optimal installation, your system must fulfill the following requirements:

Features & next release

Carme-demo v1.0 (current version)

Carme-demo v1.1 (next release)

How to install Carme-demo

Carme-demo is easy to install. Once your cluster or your single device meets the system requirements, you are ready to go.

Windows users:

Step 1: Clone the repo

As root user, in the terminal type (in clusters use the head-node):

git clone -b demo-1.0 --single-branch https://github.com/CarmeTeam/Carme.git /opt/Carme

Note: The repo must be cloned to the /opt/Carme directory.
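
Before cloning, it can help to confirm that you are root and that the target directory is usable. The following is a minimal sketch, not part of Carme: the helper names `clone_target_ok` and `is_root` are hypothetical. It relies only on the fact that `git clone` refuses to clone into an existing, non-empty directory.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not shipped with Carme.

# Succeeds if DIR is a valid clone target: it does not exist yet,
# or it exists and is an empty directory.
clone_target_ok() {
    local dir="$1"
    [ ! -e "$dir" ] && return 0
    [ -d "$dir" ] && [ -z "$(ls -A "$dir")" ]
}

# Succeeds if the current user is root (uid 0).
is_root() {
    [ "$(id -u)" -eq 0 ]
}

# Example: run both checks before cloning.
# is_root && clone_target_ok /opt/Carme && echo "ready to clone"
```

If `clone_target_ok /opt/Carme` fails, an earlier clone or installation is probably still in place and should be dealt with before you clone again.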

Step 2: Create the config file

cd /opt/Carme && bash config.sh

Note: You don’t need to modify the config file unless you want to customize it. Refer to: How to customize the config file.

Step 3: Run the installation script

bash start.sh

Note: If the install fails, refer to: What to do if the install fails.

How to access Carme-demo

How to use Carme-demo

Refer to the following link:

How to remove Carme-demo

Carme-demo is easy to remove. In the terminal type (in clusters use the head-node):

cd /opt/Carme && bash end.sh

Note: If the uninstall fails, refer to: What to do if the uninstall fails.

How to customize the config file

bash config.sh creates and customizes the config file /opt/Carme/CarmeConfig.start. If you need advanced customization, you can edit this file manually.
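
To inspect a single value without opening the file, a small reader function can be enough. This is a minimal sketch, assuming the config file consists of plain `NAME="value"` or `NAME=value` lines as shown in the tables of this section; `carme_get` is a hypothetical helper name, not a Carme command.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Carme.

# Print the value of NAME from a CarmeConfig.start-style file.
# Assumes one assignment per line, NAME="value" or NAME=value.
# If NAME appears more than once, the last assignment wins.
carme_get() {
    local file="$1" name="$2"
    sed -n "s/^${name}=//p" "$file" | tail -n 1 | sed -e 's/^"//' -e 's/"$//'
}

# Example:
# carme_get /opt/Carme/CarmeConfig.start CARME_SYSTEM
```

The function only reads text and never sources the file, so nothing in the config file is executed while you inspect it.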

Below we show all the config file variables:

USER/ADMIN

Variable Definition
CARME_UID="1000" Linux user uid, e.g., id -u ubuntu.
CARME_USER="ubuntu" Linux user.
CARME_HOME="/home/ubuntu" Linux user home folder.
CARME_GROUP="ubuntu" Linux user group, e.g., id -gn ubuntu.
CARME_USERS="single" Single-user software stack. Do not modify this variable.
CARME_SYSTEM="multi" The system is a cluster. For single devices consider CARME_SYSTEM="single".
CARME_TIMEZONE="Europe/Berlin" Choose your timezone. To list the valid values, run timedatectl list-timezones.
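
The first four values can be derived from an existing Linux account with `id` and `getent`. The sketch below prints them in the same `NAME="value"` form as the table; `carme_user_vars` is a hypothetical helper name, and the user passed to it (for example `ubuntu`) must already exist.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Carme.

# Print the USER/ADMIN values for an existing Linux account.
carme_user_vars() {
    local user="$1" home
    if ! id "$user" >/dev/null 2>&1; then
        echo "no such user: $user" >&2
        return 1
    fi
    # Field 6 of the passwd entry is the home directory.
    home="$(getent passwd "$user" | cut -d: -f6)"
    printf 'CARME_UID="%s"\n'   "$(id -u "$user")"
    printf 'CARME_USER="%s"\n'  "$user"
    printf 'CARME_HOME="%s"\n'  "$home"
    printf 'CARME_GROUP="%s"\n' "$(id -gn "$user")"
}

# Example:
# carme_user_vars ubuntu
```

Compare the printed lines with the values that bash config.sh wrote to the config file before you start the installation.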

PASSWORDS

Variable Definition
CARME_PASSWORD_USER="usrpwd" Single-user software stack does not require this variable.
CARME_PASSWORD_MYSQL="mysqlpwd" MySQL root password. Change this password if you use an already existing MySQL/MariaDB.
CARME_PASSWORD_SLURM="slurmpwd" SLURM password to control the database slurm_acct_db. Change this password if you use an already existing SLURM.
CARME_PASSWORD_DJANGO="djangopwd" Carme-frontend password to control the database webfrontend.

DATABASE

Variable Definition
CARME_DB="yes" Installs MySQL/MariaDB. CARME_DB="no" uses an already existing MySQL/MariaDB. If you choose to install MySQL/MariaDB, but you already have MySQL/MariaDB installed, then Carme will ask you if you want to reinstall the database management tool.
CARME_DB_SERVER="mysql" Uses MySQL. amd64 architectures use MySQL. arm64 architectures use MariaDB. If you prefer MariaDB in amd64, then consider CARME_DB_SERVER="mariadb".
CARME_DB_DEFAULT_NAME="webfrontend" Carme-frontend database name. If you are using an already existing MySQL/MariaDB, then check that the database name webfrontend is not used in a different project. If it is, then change the name. Note that Carme does NOT overwrite an existing webfrontend database. It will only add Carme tables to it.
CARME_DB_DEFAULT_NODE="head-node" Head-node name, i.e., hostname -s. In single-devices CARME_DB_DEFAULT_NODE="localhost". If you are using an already existing MySQL/MariaDB, consider the hostname where your database server containing the webfrontend database is installed.
CARME_DB_DEFAULT_HOST="head-node" Head-node name, i.e., hostname -s. In single-devices CARME_DB_DEFAULT_HOST="localhost". If you are using an already existing MySQL/MariaDB, consider the hostname where your database server containing the webfrontend database is installed.
CARME_DB_DEFAULT_USER="django" User name to handle webfrontend database.
CARME_DB_DEFAULT_PORT=3306 MySQL/MariaDB port where webfrontend exists. If you use a different port, then change it accordingly.
CARME_DB_SLURM_NAME="slurm_acct_db" SLURM accounting database name. If you are using an already existing SLURM, then Carme will use your already existing slurm_acct_db database. Carme does NOT overwrite/modify your already existing database, this is managed by SLURM only.
CARME_DB_SLURM_NODE="head-node" Head-node name, i.e., hostname -s. In single-devices CARME_DB_SLURM_NODE="localhost". If you are using an already existing MySQL/MariaDB, consider the hostname where your database server containing the slurm_acct_db database is installed.
CARME_DB_SLURM_HOST="head-node" Head-node name, i.e., hostname -s. In single-devices CARME_DB_SLURM_HOST="localhost". If you are using an already existing MySQL/MariaDB, consider the hostname where your database server containing the slurm_acct_db database is installed.
CARME_DB_SLURM_USER="slurm" SLURM user name to handle slurm_acct_db database. If you are using an already existing SLURM, then this user is set in your SLURM configuration.
CARME_DB_SLURM_PORT=3306 MySQL/MariaDB port where slurm_acct_db exists. If you use a different port, then change it accordingly.

SLURM

Note: Advanced SLURM features can be implemented manually.

Variable Definition
CARME_SLURM="yes" Installs SLURM. CARME_SLURM="no" uses an already existing SLURM. If you choose to install SLURM, but you already have SLURM installed, then Carme will ask you if you want to reinstall the workload management tool.
CARME_SLURM_CLUSTER_NAME="mycluster" Is your SLURM cluster name. Choose the name that you want. If you are using an already existing SLURM, then your cluster name is given with sacctmgr show cluster.
CARME_SLURM_PARTITION_NAME="carme" Is your SLURM partition name. Choose the name that you want. If you are using an already existing SLURM, you may have more than one partition.
CARME_SLURM_ACCELERATOR_TYPE="cpu" Forces Carme-demo to work with CPUs only. (GPUs will be included in the next Carme-demo release).
CARME_SLURM_SLURMCTLD_PORT=6817 Is the SLURM controller port. If you use an already existing SLURM, this port may be different. Refer to SlurmctldPort in your slurm.conf to find your actual port.
CARME_SLURM_SLURMD_PORT=6818 Is the SLURM daemon port. If you use an already existing SLURM, this port may be different. Refer to SlurmdPort in your slurm.conf to find your actual port.
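
If you use an already existing SLURM, the two ports can be read from slurm.conf. The following is a minimal sketch: `slurm_conf_get` is a hypothetical helper name, and the path /etc/slurm/slurm.conf in the example is an assumption, since the location depends on how SLURM was installed. When a key is not set in the file, the function falls back to the default you pass in (6817 and 6818 are the SLURM defaults).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Carme or SLURM.

# Print the value of KEY (e.g. SlurmctldPort) from a slurm.conf file,
# or DEFAULT when the key is not set. Parameter names in slurm.conf
# are case insensitive, so the match ignores case.
slurm_conf_get() {
    local file="$1" key="$2" default="$3" value
    value="$(grep -i "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null \
        | tail -n 1 | cut -d= -f2 | awk '{print $1}')"
    echo "${value:-$default}"
}

# Example (the path is an assumption, adjust it to your system):
# slurm_conf_get /etc/slurm/slurm.conf SlurmctldPort 6817
# slurm_conf_get /etc/slurm/slurm.conf SlurmdPort 6818
```

The printed values are the ones to use for CARME_SLURM_SLURMCTLD_PORT and CARME_SLURM_SLURMD_PORT.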

VENDORS

Note: Mambaforge, Singularity, and Go are installed in /opt/Carme/Carme-Vendors. The Traefik proxy is installed in the container image /opt/Carme/Carme-ContainerImage/Carme-Proxy-Container/proxy.sif. If you have similar vendors in your system, they won’t interfere with Carme-Vendors. Carme-Vendors must be installed in your system.

Variable Definition
MAMBAFORGE_VERSION=23.11.0-0 Go to https://github.com/conda-forge/miniforge/releases to choose a different mambaforge version.
SINGULARITY_VERSION=3.11.4 Go to https://github.com/sylabs/singularity/releases to choose a different singularity version.
PROXY_VERSION=2.11.2 Go to https://github.com/traefik/traefik/releases to choose a different traefik version.
GO_VERSION=1.20.6 Go to https://go.dev/dl/ to choose a different go version.

FRONTEND

Variable Definition
CARME_FRONTEND_KEY="..." Carme-frontend security key. To create a new one, go to https://djecrety.ir. Note that your key must not contain the character ".
CARME_FRONTEND_NODE="head-node" Head-node name, i.e., hostname -s. In single devices CARME_FRONTEND_NODE="localhost".
CARME_FRONTEND_URL="localhost" Default URL. Do not modify this variable.
CARME_FRONTEND_IP="10.0.0.27" Head-node IP, i.e., hostname -I. In single devices CARME_FRONTEND_IP="127.0.0.1".
CARME_FRONTEND_ID="Carme" Carme-frontend ID. Do not modify this variable.
CARME_FRONTEND_PORT=8888 Carme-frontend port. If you are already using port 8888, choose a different one.

BACKEND

Variable Definition
CARME_BACKEND_NODE="head-node" Head-node name, i.e., hostname -s. In single devices CARME_BACKEND_NODE="localhost".
CARME_BACKEND_PORT=56798 Carme-backend port. If you are already using port 56798, choose a different one.
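
To find out whether the frontend port 8888 or the backend port 56798 is already taken, you can try to connect to it locally. This is a minimal sketch with hypothetical helper names (`port_in_use`, `first_free_port`). It uses bash's /dev/tcp pseudo-device, so it requires bash, and it only detects services that accept connections on 127.0.0.1.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not shipped with Carme.

# Succeeds if something accepts TCP connections on 127.0.0.1:PORT.
# The connection is opened in a subshell and closed when it exits.
port_in_use() {
    local port="$1"
    (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

# Print the first port >= START that nothing is listening on locally.
first_free_port() {
    local port="$1"
    while port_in_use "$port"; do
        port=$((port + 1))
    done
    echo "$port"
}

# Example:
# port_in_use 8888 && echo "8888 is taken, try $(first_free_port 8889)"
```

A service bound only to another interface is not detected this way. For a complete list of listening TCP sockets, `ss -ltn` is an alternative where it is installed.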

NODES

Variable Definition
CARME_NODE_LIST="cnode1 cnode2" List of compute-nodes names, i.e., hostname -s. In single devices CARME_NODE_LIST="localhost".
CARME_NODE_FS="yes" Do not modify this variable.
CARME_NODE_SSHD="yes" Do not modify this variable.
CARME_NODE_SSD_PATH="/scratch" Creates a scratch directory. Do not modify this variable.
CARME_NODE_TMP_PATH="/tmp" Uses the tmp directory. Do not modify this variable.

How to configure an already existing MySQL/MariaDB

If you already have MySQL/MariaDB installed in your system, then when you run bash config.sh, answer no when asked whether you want to install a database management tool. The rest is handled by Carme.

How to configure an already existing SLURM

If you already have SLURM installed in your system, then when you run bash config.sh, answer no when asked whether you want to install SLURM. The rest is handled by Carme.

What to do if the install fails

The install is made of 10 sub-scripts that are run in order. You must not alter this order:

  1. install_system.sh
  2. install_database.sh
  3. install_slurm.sh
  4. install_vendors.sh
  5. install_certs.sh
  6. install_frontend.sh
  7. install_backend.sh
  8. install_base.sh
  9. install_scripts.sh
  10. install_proxy.sh
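
When the install stops partway, it helps to know which steps in the order above have not completed yet. The sketch below only encodes the list from this section and prints the failed step together with everything after it. `steps_from` is a hypothetical helper name. Where the sub-scripts are located and whether they can be re-run individually is not stated here, so the function deliberately runs nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Carme. It only prints names.

# Ordered install steps, copied from the list above.
INSTALL_STEPS="install_system.sh install_database.sh install_slurm.sh \
install_vendors.sh install_certs.sh install_frontend.sh install_backend.sh \
install_base.sh install_scripts.sh install_proxy.sh"

# Print the failed step and every step after it, one per line, in order.
# Returns non-zero if the given name is not one of the install steps.
steps_from() {
    local failed="$1" step found=0
    for step in $INSTALL_STEPS; do
        if [ "$step" = "$failed" ]; then
            found=1
        fi
        if [ "$found" -eq 1 ]; then
            echo "$step"
        fi
    done
    [ "$found" -eq 1 ]
}

# Example:
# steps_from install_certs.sh
```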

What to do if the uninstall fails

The uninstall is made of 8 sub-scripts that are run in order. You must not alter this order:

  1. remove_proxy.sh
  2. remove_base.sh
  3. remove_backend.sh
  4. remove_frontend.sh
  5. remove_certs.sh
  6. remove_vendors.sh
  7. remove_slurm.sh
  8. remove_database.sh

How to install WSL in a Windows device

Open the Windows PowerShell and type:

wsl --install

Note: By default, Ubuntu Linux is installed.

In the process you will be asked to:

Enter new UNIX username:
password:

Once the installation completes, you have access to the Ubuntu terminal. If you open a new PowerShell, type wsl.exe to access the Ubuntu terminal.

To install Carme-demo, you must be a root user. In the terminal type:

sudo su 

Carme is installed in the /opt directory, so change into it:

cd /opt

Now, you are ready to clone the repo and install Carme-demo. Follow the steps given in How to install Carme-demo.

If you don’t want to install Carme-demo in your active WSL, you can create a test environment in a separate WSL distribution. Refer to: How to install Carme-demo in a Windows device considering a WSL test environment.

How to install Carme-demo in a Windows device considering a WSL test environment

For example, choose one of the following versions:

Ubuntu 20.04 test environment

Open the Windows PowerShell.

To download the WSL tar file for Ubuntu 20.04, type:

Invoke-WebRequest https://cloud-images.ubuntu.com/releases/focal/release/ubuntu-20.04-server-cloudimg-amd64-wsl.rootfs.tar.gz -OutFile ubuntu-20.04-server-cloudimg-amd64-wsl.rootfs.tar.gz

Import the tar file as a new Ubuntu distribution:

wsl --import carme-ubuntu20.04 carme-ubuntu20.04 ubuntu-20.04-server-cloudimg-amd64-wsl.rootfs.tar.gz

Delete the tar file:

Remove-Item -Recurse ubuntu-20.04-server-cloudimg-amd64-wsl.rootfs.tar.gz

Access the terminal of the new Ubuntu distribution:

wsl -d carme-ubuntu20.04

Now that you are in the Ubuntu terminal, enable systemd and then exit back to the PowerShell:

cat << 'EOF' >> /etc/wsl.conf
[boot]
systemd=true
EOF
exit
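
The commands above append to /etc/wsl.conf, so running them twice writes the stanza twice. A guarded variant is sketched below; `enable_wsl_systemd` is a hypothetical helper name. It assumes the file has no [boot] section yet. If one already exists with other keys, edit the file by hand instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Carme or WSL.

# Append the [boot] systemd=true stanza to FILE (default /etc/wsl.conf)
# unless a systemd=true line is already present.
enable_wsl_systemd() {
    local file="${1:-/etc/wsl.conf}"
    if grep -qs '^[[:space:]]*systemd[[:space:]]*=[[:space:]]*true' "$file"; then
        echo "systemd already enabled in $file"
    else
        printf '[boot]\nsystemd=true\n' >> "$file"
    fi
}

# Example (as root, inside the distribution):
# enable_wsl_systemd && exit
```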

In the PowerShell, restart the new distribution:

wsl --terminate carme-ubuntu20.04
wsl -d carme-ubuntu20.04
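
Once the distribution has restarted, it can be worth confirming inside the Linux terminal that systemd is now running as PID 1. This is a minimal sketch for Linux with hypothetical helper names (`pid1_name`, `systemd_is_init`). It reads the command name of process 1 from /proc.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not shipped with Carme or WSL.

# Print the command name of PID 1 (Linux only: reads /proc/1/comm).
pid1_name() {
    cat /proc/1/comm
}

# Succeeds if PID 1 is systemd.
systemd_is_init() {
    [ "$(pid1_name)" = "systemd" ]
}

# Example:
# systemd_is_init && echo "systemd is active" || echo "PID 1 is $(pid1_name)"
```

If PID 1 is not systemd, check /etc/wsl.conf and terminate the distribution again before you continue.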

Now you are back in the Ubuntu terminal. Add a new user (in this example the new user is ubuntu):

adduser --gecos "" --disabled-password ubuntu
echo "ubuntu:password" | chpasswd

Clone the repository to the /opt/Carme directory:

git clone -b demo-1.0 --single-branch https://github.com/CarmeTeam/Carme.git /opt/Carme

Change into the /opt/Carme directory and then start the installation:

cd /opt/Carme/ && bash config.sh && bash start.sh

Once the installation is finished, you can access Carme-demo. Open a browser and type in the URL box:

localhost:10443

If the installation fails, refer to: What to do if the install fails.

To test Carme-demo, refer to: How to use Carme-demo. Once you finish testing Carme-demo, you can discard the distribution:

wsl --terminate carme-ubuntu20.04
wsl --unregister carme-ubuntu20.04
Remove-Item -Recurse carme-ubuntu20.04

If you like Carme-demo, you can install it in your main WSL distribution. In the PowerShell type wsl.exe and follow the steps given in: How to install Carme-demo.

Ubuntu 22.04 test environment

Open the Windows PowerShell.

To download the WSL tar file for Ubuntu 22.04, type:

Invoke-WebRequest https://cloud-images.ubuntu.com/wsl/releases/22.04/current/ubuntu-jammy-wsl-amd64-wsl.rootfs.tar.gz -OutFile ubuntu-jammy-wsl-amd64-wsl.rootfs.tar.gz

Import the tar file as a new Ubuntu distribution:

wsl --import carme-ubuntu22.04 carme-ubuntu22.04 ubuntu-jammy-wsl-amd64-wsl.rootfs.tar.gz

Delete the tar file:

Remove-Item -Recurse ubuntu-jammy-wsl-amd64-wsl.rootfs.tar.gz

Access the terminal of the new Ubuntu distribution:

wsl -d carme-ubuntu22.04

Now that you are in the Ubuntu terminal, enable systemd and then exit back to the PowerShell:

cat << 'EOF' >> /etc/wsl.conf
[boot]
systemd=true
EOF
exit

In the PowerShell, restart the new distribution:

wsl --terminate carme-ubuntu22.04
wsl -d carme-ubuntu22.04

Now you are back in the Ubuntu terminal. Add a new user (in this example the new user is ubuntu):

adduser --gecos "" --disabled-password ubuntu
echo "ubuntu:password" | chpasswd

Clone the repository to the /opt/Carme directory:

git clone -b demo-1.0 --single-branch https://github.com/CarmeTeam/Carme.git /opt/Carme

Change into the /opt/Carme directory and then start the installation:

cd /opt/Carme/ && bash config.sh && bash start.sh

Once the installation is finished, you can access Carme-demo. Open a browser and type in the URL box:

localhost:10443

If the installation fails, refer to: What to do if the install fails.

To test Carme-demo, refer to: How to use Carme-demo. Once you finish testing Carme-demo, you can discard the distribution:

wsl --terminate carme-ubuntu22.04
wsl --unregister carme-ubuntu22.04
Remove-Item -Recurse carme-ubuntu22.04

If you like Carme-demo, you can install it in your main WSL distribution. In the PowerShell type wsl.exe and follow the steps given in: How to install Carme-demo.

Rocky 9 test environment

Open the Windows PowerShell.

To download the WSL tar file for Rocky 9, type:

Invoke-WebRequest https://dl.rockylinux.org/pub/rocky/9/images/x86_64/Rocky-9-Container-Base.latest.x86_64.tar.xz -OutFile Rocky-9-Container-Base.latest.x86_64.tar.xz

Import the tar file as a new Rocky distribution:

wsl --import carme-rocky9 carme-rocky9 Rocky-9-Container-Base.latest.x86_64.tar.xz

Delete the tar file:

Remove-Item -Recurse Rocky-9-Container-Base.latest.x86_64.tar.xz

Access the terminal of the new Rocky distribution:

wsl -d carme-rocky9

Now that you are in the Rocky terminal, install and enable systemd, and then exit back to the PowerShell:

dnf install systemd -y
cat << 'EOF' >> /etc/wsl.conf
[boot]
systemd=true
EOF
exit

In the PowerShell, restart the new distribution:

wsl --terminate carme-rocky9
wsl -d carme-rocky9

Now you are back in the Rocky terminal. Enable the CRB and EPEL repositories and add a new user (in this example the new user is rocky):

dnf install -y 'dnf-command(config-manager)'
dnf config-manager --set-enabled crb
dnf install -y epel-release
dnf clean all
adduser rocky
echo "rocky:password" | chpasswd

Clone the repository to the /opt/Carme directory:

git clone -b demo-1.0 --single-branch https://github.com/CarmeTeam/Carme.git /opt/Carme

Change into the /opt/Carme directory and then start the installation:

cd /opt/Carme/ && bash config.sh && bash start.sh

Once the installation is finished, you can access Carme-demo. Open a browser and type in the URL box:

localhost:10443

If the installation fails, refer to: What to do if the install fails.

To test Carme-demo, refer to: How to use Carme-demo. Once you finish testing Carme-demo, you can discard the distribution:

wsl --terminate carme-rocky9
wsl --unregister carme-rocky9
Remove-Item -Recurse carme-rocky9

If you like Carme-demo, you can install it in your main WSL distribution. In the PowerShell type wsl.exe and follow the steps given in: How to install Carme-demo.

How to set SSH keys in a cluster

Let’s consider that your cluster is made of 1 head-node and 2 compute-nodes. In each node, hostname -s and hostname -I give, e.g.,

| Node | hostname -s | hostname -I |
|---|---|---|
| head node | carmec0 | 10.0.0.1 |
| compute node 1 | carmec1 | 10.0.0.10 |
| compute node 2 | carmec2 | 10.0.0.11 |

Step 1: Modify /etc/hosts

In the head node, /etc/hosts should have:

127.0.0.1       localhost
127.0.1.1       carmec0

10.0.0.1        carmec0
10.0.0.10       carmec1
10.0.0.11       carmec2

In the compute node 1, /etc/hosts should have:

127.0.0.1       localhost
127.0.1.1       carmec1

10.0.0.1        carmec0
10.0.0.10       carmec1
10.0.0.11       carmec2

And in the compute node 2, /etc/hosts should have:

127.0.0.1       localhost
127.0.1.1       carmec2

10.0.0.1        carmec0
10.0.0.10       carmec1
10.0.0.11       carmec2
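
After editing, you can check on each node that the three cluster entries are really present. The following is a minimal sketch: `hosts_has` is a hypothetical helper name. It inspects the file contents only and does not query the system resolver.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Carme.

# Succeeds if FILE contains a line that maps IP to NAME.
# Text after a '#' is treated as a comment and ignored.
hosts_has() {
    local file="$1" ip="$2" name="$3"
    awk -v ip="$ip" -v name="$name" '
        { sub(/#.*/, "") }
        $1 == ip {
            for (i = 2; i <= NF; i++) if ($i == name) found = 1
        }
        END { exit !found }
    ' "$file"
}

# Example, using the addresses from the table above:
# hosts_has /etc/hosts 10.0.0.1  carmec0 || echo "carmec0 entry missing"
# hosts_has /etc/hosts 10.0.0.10 carmec1 || echo "carmec1 entry missing"
# hosts_has /etc/hosts 10.0.0.11 carmec2 || echo "carmec2 entry missing"
```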

Step 2: Create the SSH keys

In the head node, type:

ssh-keygen -t ed25519 -N "" -C "root@carmec0"

This creates your passphraseless SSH key in /root/.ssh/. Print the public key:

cat /root/.ssh/id_ed25519.pub

Copy the output to /root/.ssh/authorized_keys in the head-node.

Copy the output to /root/.ssh/authorized_keys in the compute-nodes.

Congratulations! Now you can ssh from the head-node to itself with ssh carmec0 and ssh localhost, and from the head-node to the compute-nodes with ssh carmec1 and ssh carmec2.
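
The copy step in Step 2 can also be scripted. This is a minimal sketch with a hypothetical helper name (`authorize_key`). It appends a public key to an authorized_keys file only if the key is not already listed, and sets the restrictive permissions conventionally used for the .ssh directory and the authorized_keys file.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with Carme or OpenSSH.

# Append the public key in PUBFILE to AUTHFILE unless it is already
# listed. Sets mode 700 on the directory and 600 on the file.
authorize_key() {
    local pubfile="$1" authfile="$2" dir
    dir="$(dirname "$authfile")"
    mkdir -p "$dir"
    chmod 700 "$dir"
    touch "$authfile"
    # -x: match whole lines, -F: treat the key as a fixed string.
    grep -qxF -- "$(cat "$pubfile")" "$authfile" || cat "$pubfile" >> "$authfile"
    chmod 600 "$authfile"
}

# Example (on the head-node, as root):
# authorize_key /root/.ssh/id_ed25519.pub /root/.ssh/authorized_keys
```

For the compute-nodes, `ssh-copy-id -i /root/.ssh/id_ed25519.pub root@carmec1` performs the equivalent step remotely, assuming root can still log in to that node with a password. Otherwise, paste the key by hand as described above.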