Basic Quickstart

QuickStart Guide

Overview

This quickstart guide will help you get the slurm-operator running and deploy Slurm clusters to Kubernetes.

Install

Pre-Requiisites

Install the pre-requisite helm charts.

 helm repo add prometheus-community
https://prometheus-community.github.io/helm-charts helm repo add metrics-server
https://kubernetes-sigs.github.io/metrics-server/ helm repo add bitnami
https://charts.bitnami.com/bitnami helm repo add jetstack
https://charts.jetstack.io helm repo update helm install cert-manager
jetstack/cert-manager \
 --namespace cert-manager --create-namespace --set crds.enabled=true helm
install prometheus prometheus-community/kube-prometheus-stack \
 --namespace prometheus --create-namespace --set installCRDs=true

Slurm Operator

Download values and install the slurm-operator from OCI package.

 curl -L
https://raw.githubusercontent.com/SlinkyProject/slurm-operator/refs/tags/v0.1.0/helm/slurm-operator/values.yaml
\
 -o values-operator.yaml helm install slurm-operator
oci://ghcr.io/slinkyproject/charts/slurm-operator \
 --values=values-operator.yaml --version=0.1.0 --namespace=slinky
--create-namespace 

Make sure the cluster deployed successfully with:

 kubectl --namespace=slinky get pods 

Output should be similar to:

 NAME READY STATUS RESTARTS AGE
slurm-operator-7444c844d5-dpr5h 1/1 Running 0 5m00s
slurm-operator-webhook-6fd8d7857d-zcvqh 1/1 Running 0 5m00s 

Slurm Cluster

Download values and install a Slurm cluster from OCI package.

 curl -L
https://raw.githubusercontent.com/SlinkyProject/slurm-operator/refs/tags/v0.1.0/helm/slurm/values.yaml
\
 -o values-slurm.yaml helm install slurm
oci://ghcr.io/slinkyproject/charts/slurm \
 --values=values-slurm.yaml --version=0.1.0 --namespace=slurm --create-namespace

Make sure the slurm cluster deployed successfully with:

 kubectl --namespace=slurm get pods 

Output should be similar to:

 NAME READY STATUS RESTARTS AGE slurm-accounting-0 1/1
Running 0 5m00s slurm-compute-debug-0 1/1 Running 0 5m00s slurm-controller-0 2/2
Running 0 5m00s slurm-exporter-7b44b6d856-d86q5 1/1 Running 0 5m00s
slurm-mariadb-0 1/1 Running 0 5m00s slurm-restapi-5f75db85d9-67gpl 1/1 Running 0
5m00s 

Testing

To test Slurm functionality, connect to the controller to use Slurm client commands:

 kubectl --namespace=slurm exec -it
statefulsets/slurm-controller -- bash --login 

On the controller pod (e.g. host slurm@slurm-controller-0), run the following commands to quickly test Slurm is functioning:

 sinfo srun hostname sbatch --wrap="sleep 60" squeue

See Slurm Commands for more details on how to interact with Slurm.