Overview
While containers might seem a viable way of running complex, yet readily packaged applications, their runtime implications on a cluster can turn out to be more difficult than expected.
For example, access to files on our shared file systems needs to be configured per container (e.g. by creating “volumes”), and most container runtime implementations are not well suited to working from shared file systems (NFS or the GPFS/Spectrum Storage our Lichtenberg is based upon).
From within a container, it is also difficult to access our software module system.
“Over”containerization
In some cases, a container recipe does nothing more than install a certain Python interpreter version plus a few (special) Python modules. Instead of trying to run such a “simple” container, you might be better off with a similarly configured “Python virtual environment”.
This is easy to create and maintain, and it works perfectly with shared file systems and with our module system.
Thus, first check whether your desired container recipe is just such a simple one – if so, it is not worth the effort of getting it to run on the cluster!
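As an illustration, a “simple” container installing just a Python version and a few packages could be replaced by something like the following (the module name python/3.10 and the package list are only placeholders – check “module avail python” and your recipe):
# Load a Python version from our module system (module name is an example):
module load python/3.10
# Create and activate a virtual environment in your home or project directory:
python -m venv ~/myenv
source ~/myenv/bin/activate
# Install whatever the container recipe would have installed:
pip install numpy pandas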
On the Lichtenberg cluster, we support only the following container runtime implementations:
Apptainer (formerly Singularity)
Partially supported (with a lot of manual effort on our part):
podman – requires manual mapping of sub-UIDs and sub-GIDs and does not work well with shared file systems (see the quick check below this list)
Not supported is
Docker – requires elevated privileges (“docker”/root rights)
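If you nevertheless want to try podman, you can check beforehand whether sub-UID/GID mappings exist for your account (a quick diagnostic only, not an official setup procedure – entries may also be listed by numeric UID instead of user name):
# Show any sub-UID/sub-GID ranges mapped to your user (empty output means none):
grep "^$(id -un):" /etc/subuid /etc/subgid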
Apptainer
Apptainer (f.k.a. Singularity) provides a lightweight, portable container runtime – and since it was developed with HPC in mind, it runs quite well on Linux clusters with shared file systems.
Setting up
To fetch and convert existing container images from Docker's to Apptainer's format (here, “lolcow” is used as an example):
# From a docker registry:
mkdir -p myCont && cd myCont
singularity build lolcow.sif docker://godlovedc/lolcow
singularity run lolcow.sif
# From a docker archive ("podman pull" only works if podman is correctly configured):
mkdir -p myCont && cd myCont
podman pull docker://godlovedc/lolcow
# Export the pulled image into a docker archive file:
podman save -o lolcow_docker_archive.tar docker.io/godlovedc/lolcow
singularity build lolcow_from_archive.sif docker-archive://$(pwd)/lolcow_docker_archive.tar
singularity run lolcow_from_archive.sif
# From a Dockerfile (using "spython" to translate it into a Singularity definition file):
mkdir -p myCont && cd myCont
git clone https://github.com/GodloveD/lolcow
# Create a Python virtual environment and install spython into it:
python -m venv myenv
source myenv/bin/activate
pip install spython
spython recipe lolcow/Dockerfile ./sing.lolcow
singularity build --fakeroot lolcow_from_dockerfile.sif sing.lolcow
singularity run lolcow_from_dockerfile.sif
deactivate
Using in a batch job
To use such a converted Apptainer image in a batch job, simply add the two lines
cd myCont
singularity run lolcow_from_dockerfile.sif
to your job script.
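A minimal sketch of such a job script might look as follows (all #SBATCH values are placeholders – adjust runtime and resources to your needs):
#!/bin/bash
#SBATCH --job-name=lolcow
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=2G
#SBATCH --time=00:10:00

# Run the previously built container image:
cd myCont
singularity run lolcow_from_dockerfile.sif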
Example with Nvidia GPUs:
It is also possible to use Apptainer/Singularity with Nvidia GPUs (currently, AMD and Intel GPUs still have problems):
# We request 2 Nvidia Hopper H100 GPUs for interactive usage:
srun -t 07:00:00 -n16 --mem-per-cpu=4G --gres=gpu:h100:2 --pty /bin/bash
# We export two environment variables setting the paths for temporary files, to be sure to have enough free space for them:
export TMPDIR=${HPC_SCRATCH} APPTAINER_TMPDIR=${HPC_SCRATCH}
# We build an Apptainer container Image, saving it in the file "pytorch_cont.sif"
apptainer build pytorch_cont.sif docker://nvcr.io/nvidia/pytorch:25.03-py3
# We launch a container based on this image:
apptainer shell --nv pytorch_cont.sif
# Testing it:
Apptainer> python
Python 3.12.3 (main, Feb 4 2025, 14:48:35) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
2
>>> exit()
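The same image can of course also be used non-interactively in a GPU batch job, for example (the script name my_training_script.py is a placeholder):
# Inside a job script that requested GPUs (e.g. --gres=gpu:h100:2):
export TMPDIR=${HPC_SCRATCH} APPTAINER_TMPDIR=${HPC_SCRATCH}
apptainer exec --nv pytorch_cont.sif python my_training_script.py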