Containerize Your Code

In this tutorial, you will write a Dockerfile, build a container image, test it locally, and push it to Docker Hub so it can be used on ACE HPC. We'll work through a single example from start to finish.

What is a Dockerfile?

A Dockerfile is a plain text file that contains a sequence of instructions for building a container image. Each instruction creates a layer in the image. Docker reads the Dockerfile top-to-bottom, executing each instruction in order, and the result is an image you can run anywhere.

Here are the core instructions:

Instruction	What it does
`FROM`	Sets the base image — your starting point
`RUN`	Executes a command during the build (install packages, compile code, etc.)
`COPY`	Copies files from your machine into the image
`ENV`	Sets an environment variable that persists when the container runs
`WORKDIR`	Sets the working directory for subsequent instructions
`CMD`	Defines the default command when the container starts
`ENTRYPOINT`	Configures the container to run as an executable

Let's see how these work together in practice.

The Example: A Monte Carlo Simulation

We'll containerize a Python script that estimates the value of Pi using a Monte Carlo method. This is a common computational technique — generate random points in a square, count how many fall inside a circle, and use the ratio to estimate Pi.

Project Setup

On your local workstation (where Docker is installed), create a project directory:

$ mkdir ~/monte-carlo && cd ~/monte-carlo

The Application Code

Create a file called estimate_pi.py:

#!/usr/bin/env python3
"""Estimate Pi using a Monte Carlo method."""

import argparse
import numpy as np
import time
import json
import os

def estimate_pi(num_samples):
    """Generate random points in a unit square, count those inside the unit circle."""
    x = np.random.uniform(0, 1, num_samples)
    y = np.random.uniform(0, 1, num_samples)
    inside_circle = np.sum(x**2 + y**2 <= 1)
    return 4 * inside_circle / num_samples

def main():
    parser = argparse.ArgumentParser(description="Estimate Pi using Monte Carlo simulation")
    parser.add_argument("samples", type=int, nargs="?", default=1_000_000,
                        help="Number of random samples (default: 1,000,000)")
    parser.add_argument("--output", "-o", type=str, default=None,
                        help="Path to write JSON results")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    args = parser.parse_args()

    if args.seed is not None:
        np.random.seed(args.seed)

    print(f"Estimating Pi with {args.samples:,} samples...")
    start = time.time()
    pi_estimate = estimate_pi(args.samples)
    elapsed = time.time() - start

    error = abs(pi_estimate - np.pi)
    print(f"  Estimate: {pi_estimate:.8f}")
    print(f"  Actual:   {np.pi:.8f}")
    print(f"  Error:    {error:.8f}")
    print(f"  Time:     {elapsed:.3f}s")

    if args.output:
        results = {
            "samples": args.samples,
            "estimate": pi_estimate,
            "error": error,
            "elapsed_seconds": elapsed,
        }
        with open(args.output, "w") as f:
            json.dump(results, f, indent=2)
        print(f"  Results written to {args.output}")

if __name__ == "__main__":
    main()

Test it locally to make sure it works:

$ python estimate_pi.py 100000

The Requirements File

Create requirements.txt with pinned versions for reproducibility:

numpy==1.24.3

Why pin versions?

Using numpy==1.24.3 instead of just numpy ensures that anyone building this container — today or six months from now — gets the exact same version. Without pinning, pip install numpy installs whatever the latest version is at build time, which can break your code if the API changes.

Writing the Dockerfile

Now we translate the manual setup steps into a Dockerfile. Create a file called Dockerfile (no extension):

FROM python:3.11-slim

RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt

COPY estimate_pi.py /app/estimate_pi.py
RUN chmod +x /app/estimate_pi.py

ENV PATH="/app:$PATH"

WORKDIR /app

CMD ["python", "estimate_pi.py"]

Let's walk through each instruction:

FROM python:3.11-slim — We start from the official Python 3.11 image, using the -slim variant which strips out compilers and docs we don't need. This is ~125 MB instead of ~1 GB for the full image. Avoid python:latest — the meaning of "latest" changes over time, making your builds non-reproducible.

RUN apt-get update && apt-get install ... — Installs system-level build tools that numpy needs for compilation. The --no-install-recommends flag skips optional packages, and rm -rf /var/lib/apt/lists/* cleans up the package cache. These are combined into a single RUN instruction because each RUN creates a new image layer — combining them keeps the image smaller.

COPY requirements.txt ... then RUN pip install ... — We copy the requirements file before the application code. This is intentional: Docker caches each layer, and layers only rebuild when their inputs change. If you change estimate_pi.py but not requirements.txt, Docker reuses the cached pip install layer instead of reinstalling all dependencies. This can save minutes on every rebuild.

COPY estimate_pi.py ... then RUN chmod +x ... — Copies the application code and makes it executable.

ENV PATH="/app:$PATH" — Adds /app to the system PATH so you can run estimate_pi.py directly without specifying the full path.

WORKDIR /app — Sets the working directory. Any subsequent commands (and the default command when the container starts) run from this directory.

CMD ["python", "estimate_pi.py"] — The default command. When you run the container without specifying a command, it executes this. You can override it at runtime by appending a different command.

Building the Image

From the ~/monte-carlo directory (where your Dockerfile lives), run:

$ docker build --platform linux/amd64 -t ianwasukira/monte-carlo:0.1 .

--platform tells docker to build the image for a specific target operating platform, Intel/AMD Linux systems
-t ianwasukira/monte-carlo:0.1 specifies the account docker should push the image to & tags the image with a name and version
. tells Docker to use the current directory as the build context (it looks for a file named Dockerfile here)

[+] Building 0.5s (10/10) FINISHED                                                                                                                                                             docker:desktop-linux
 => [internal] load build definition from Dockerfile                                                                                                                                                           0.0s
 => => transferring dockerfile: 244B                                                                                                                                                                           0.0s
 => [internal] load metadata for docker.io/library/ubuntu:24.04                                                                                                                                                0.4s
 => [internal] load .dockerignore                                                                                                                                                                              0.0s
 => => transferring context: 2B                                                                                                                                                                                0.0s
 => [1/5] FROM docker.io/library/ubuntu:24.04@sha256:d1e2e92c075e5ca139d51a140fff46f84315c0fdce203eab2807c7e495eff4f9                                                                                          0.0s
 => => resolve docker.io/library/ubuntu:24.04@sha256:d1e2e92c075e5ca139d51a140fff46f84315c0fdce203eab2807c7e495eff4f9                                                                                          0.0s
 => [internal] load build context                                                                                                                                                                              0.0s
 => => transferring context: 92B                                                                                                                                                                               0.0s
 => CACHED [2/5] RUN apt-get update && apt-get upgrade -y                                                                                                                                                      0.0s
 => CACHED [3/5] RUN apt-get install -y python3                                                                                                                                                                0.0s
 => CACHED [4/5] COPY pi.py /code/pi.py                                                                                                                                                                        0.0s
 => CACHED [5/5] RUN chmod +rx /code/pi.py                                                                                                                                                                     0.0s
 => exporting to image                                                                                                                                                                                         0.0s
 => => exporting layers                                                                                                                                                                                        0.0s
 => => exporting manifest sha256:02282c6a660f0cddec59ad57f11bad9be4fa447572128d5a50109d8e8359a478                                                                                                              0.0s
 => => exporting config sha256:d8631be9d908addfd25f392d8cbf70240021c301bc79553e48d6af70067c8450                                                                                                                0.0s
 => => exporting attestation manifest sha256:e2a61ec11f0ea3d3b8c9d38e65039e1c0b9a74f5059478c7b5dcc73d449ee288                                                                                                  0.0s
 => => exporting manifest list sha256:efa123ce9ceaac09819d6f6a0d591c5b39209b61ba4a8d3657a8eac08edfcbd8                                                                                                         0.0s
 => => naming to docker.io/ianwasukira/monte-carlo:0.1                                                                                                                                                         0.0s

You should see Docker execute each instruction. Subsequent builds will be much faster because of layer caching — Docker skips unchanged layers.

Verify the image was created:

$ docker images monte-carlo

REPOSITORY    TAG    IMAGE ID       CREATED          SIZE
monte-carlo   0.1    a1b2c3d4e5f6   30 seconds ago   198MB

Testing the Container

Run the container with the default command:

$ docker run --rm monte-carlo:0.1

The --rm flag removes the container after it exits (otherwise stopped containers accumulate). You should see output like:

Estimating Pi with 1,000,000 samples...
  Estimate: 3.14132400
  Actual:   3.14159265
  Error:    0.00026865
  Time:     0.024s

Override the default to pass arguments:

$ docker run --rm monte-carlo:0.1 python estimate_pi.py 10000000 --seed 42

Write output to a file using a bind mount — this maps a directory on your host into the container so data persists after the container exits:

$ mkdir -p ~/monte-carlo/output

$ docker run --rm \
    -v ~/monte-carlo/output:/output \
    monte-carlo:0.1 \
    python estimate_pi.py 5000000 --output /output/results.json

$ cat ~/monte-carlo/output/results.json

Start an interactive shell inside the container to explore:

$ docker run --rm -it monte-carlo:0.1 /bin/bash

From inside the container, you can run python, check installed packages with pip list, inspect the filesystem, and so on. Type exit to leave.

Using .dockerignore

When Docker builds an image, it sends the entire build context (the . directory) to the Docker daemon. If your project directory contains large data files, .git history, or output files, these are unnecessarily copied and slow down the build.

Create a .dockerignore file to exclude them:

.git
__pycache__
*.pyc
output/
*.log

This works exactly like .gitignore — any matching files are excluded from the build context.

Pushing to a Container Registry

Your image currently lives only on your local machine. To use it on ACE HPC, push it to a container registry.

Docker Hub

# Log in (you'll be prompted for your Docker Hub credentials)
$ docker login

# Tag the image with your Docker Hub username
$ docker tag monte-carlo:0.1 ianwasukira/monte-carlo:0.1

# Push
$ docker push ianwasukira/monte-carlo:0.1

The push refers to repository [docker.io/ianwasukira/monte-carlo]
b73415306fae: Pushed
4f4fb700ef54: Mounted from ianwasukira/pi-estimator
66a4bbbfab88: Pushed
3fc31d9c7e98: Pushed
3cc646d3f0dd: Pushed
9dd65c448696: Pushed
0.1: digest: sha256:7e702eccde1d6c730f65a47d45ce7078a54519a232dd6b84137ba14f2f4705f4 size: 856

Navigate to Docker Hub on your preferred web browser, and go to your repositories.

Image Pushed to DockerHub

Alternatively, you can use the GitHub Coontainer Registry (GHCR);

GitHub Container Registry

# Log in using a personal access token
$ echo $GITHUB_TOKEN | docker login ghcr.io -u USERNAME --password-stdin

# Tag for GHCR
$ docker tag monte-carlo:0.1 ghcr.io/yourusername/monte-carlo:0.1

# Push
$ docker push ghcr.io/yourusername/monte-carlo:0.1

Pulling on ACE HPC

Once pushed, Apptainer can convert the Docker image to its native .sif format. apptainer pull downloads and converts the image — the result is a single portable file you can reference in your job scripts:

$ module load apptainer
$ apptainer pull docker://iawasukira/monte-carlo:0.1
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
INFO:    Fetching OCI image...
28.4MiB / 28.4MiB [=============================================================================================================================================================================] 100 % 3.5 MiB/s 0s
13.2MiB / 13.2MiB [=============================================================================================================================================================================] 100 % 3.5 MiB/s 0s
41.1MiB / 41.1MiB [=============================================================================================================================================================================] 100 % 3.5 MiB/s 0s
INFO:    Extracting OCI image...
INFO:    Inserting Apptainer configuration...
INFO:    Creating SIF file...
INFO:    To see mksquashfs output with progress bar enable verbose logging

This creates a file called monte-carlo_0.1.sif. See the next section for how to pull and run this correctly inside a SLURM job.

Running Containers with SLURM

Do not run containers on the head node

The commands shown above (apptainer pull and apptainer run) are for illustration only. Do not run them directly on the head node. The head node is a shared login environment — running compute workloads there slows it down for everyone and may get your session terminated by the system administrators.

All container execution must happen inside a SLURM job script submitted to the compute cluster.

The apptainer pull step (converting the Docker image to a .sif file) can be resource-intensive and should also be done inside a job. Here is the correct approach for pulling and running the Monte Carlo container:

#!/bin/bash
#SBATCH --job-name=monte-carlo
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=01:00:00
#SBATCH --output=monte-carlo-%j.out
#SBATCH --error=monte-carlo-%j.err

module load apptainer

# Pull the image from Docker Hub (only needed once; skip if .sif already exists)
if [ ! -f monte-carlo_0.1.sif ]; then
    apptainer pull docker://ianwasukira/monte-carlo:0.1
fi

# Run the simulation inside the container
apptainer run monte-carlo_0.1.sif 1000000 --output results.json

Submit it with:

$ sbatch monte-carlo.sh

The full workflow for running containers on ACE HPC, including bind mounts and GPU access, is covered in Containers on HPC Clusters.

Next: Containers on HPC Clusters — pull your images on ACE HPC with Apptainer, write SLURM job scripts, and run MPI and GPU workloads.

What is a Dockerfile?​

The Example: A Monte Carlo Simulation​

Project Setup​

The Application Code​

The Requirements File​

Writing the Dockerfile​

Building the Image​

Testing the Container​

Using .dockerignore​

Pushing to a Container Registry​

Docker Hub​

GitHub Container Registry​

Pulling on ACE HPC​

Running Containers with SLURM​