
Deploy with Docker

A step-by-step guide to the containerised prover stack. A single docker compose up brings the whole cluster online: one coordinator, N workers (scaled at the command line), Prometheus scraping the coordinator, and Grafana with the ZisK dashboard pre-provisioned. By the end of this page you'll have the full stack running and a real gcd proof on disk.

When to pick this

Pick this path when you want the cluster bundled as containers rather than as host services or hand-launched binaries. Right shape for:

  • A demo or CI smoke test that needs to tear down cleanly.
  • A single-host staging environment with Prometheus + Grafana attached out of the box.
  • Shipping the prover as part of a wider docker-compose stack.

For a 24/7 production prover under systemd or launchd, see Deploy as services; for a developer machine where you just want the binaries on PATH, see Deploy the binaries directly.


Prerequisites

| Requirement | Notes |
| --- | --- |
| Linux or macOS | Tested on Ubuntu 22.04+ and macOS 13+ (Apple Silicon or Intel). |
| Docker Engine 24+ | Must include the Compose plugin v2 (docker compose version should print v2.x). |
| ~32 GB RAM | Each worker container preallocates the Assembly emulator's shared regions; the host needs enough headroom. |
| ZisK bundle on host (/opt/zisk/) | Worker containers mount the toolchain + proving key read-only from this path. ziskup is the easiest way to populate it. |
| Internet access | First docker compose build pulls base images, Rust toolchain, and crates from the network. |
| Open ports 7001, 9090, 9091, 3000 | Coordinator public API, Prometheus, coordinator metrics, and Grafana on the host side (mapped from containers). |

System dependencies (apt / brew packages used to populate /opt/zisk/) live in the developer install guides: Linux, macOS.
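The RAM requirement in the table can be checked before building anything. This is a minimal sketch, assuming a Linux host that exposes /proc/meminfo. The function names and the 30 GiB floor are illustrative, not part of ZisK. MemTotal excludes memory the kernel reserves, so a host with 32 GB installed typically reports slightly less, which is why the usage line compares against 30 rather than 32.

```shell
# Hypothetical preflight helpers; names and thresholds are illustrative.

# Convert a KiB count (the unit /proc/meminfo uses) to whole GiB, rounding down.
kib_to_gib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Print total host memory in GiB (Linux only: reads MemTotal from /proc/meminfo).
host_mem_gib() {
  local kib
  kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  kib_to_gib "$kib"
}

# Succeed when the first argument (GiB available) is at least the second (GiB required).
has_enough_ram() {
  [ "$1" -ge "$2" ]
}

# Usage:
#   has_enough_ram "$(host_mem_gib)" 30 || echo "host may be short on RAM" >&2
```

On macOS, sysctl -n hw.memsize reports the installed memory in bytes and could feed the same comparison after conversion.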


Install Docker

Pick your OS. On Linux, install Docker Engine directly using the steps below; on macOS, install Docker Desktop, which also includes the Compose plugin.

1. Install Docker Engine. The upstream convenience script is the fastest path on Ubuntu / Debian:

curl -fsSL https://get.docker.com | sudo sh

For other distros or hardened environments, follow the official Docker Engine install docs.

2. Allow your user to run docker without sudo (optional but recommended):

sudo usermod -aG docker $USER
newgrp docker # apply the group change without logging out

3. Install the Docker Compose v2 plugin. Drop the latest release into Docker's CLI-plugins directory so docker compose becomes available as a subcommand:

mkdir -p ~/.docker/cli-plugins
curl -SL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 \
-o ~/.docker/cli-plugins/docker-compose
chmod +x ~/.docker/cli-plugins/docker-compose

4. Verify.

docker version
docker compose version # must print "Docker Compose version v2.x"
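The version check can also be scripted, which is handy in CI. This sketch parses the string that docker compose version prints. The function names are illustrative, and it assumes the output carries a v-prefixed version such as v2.24.5.

```shell
# Extract the major version from `docker compose version` output.
# Accepts strings such as "Docker Compose version v2.24.5"; prints nothing if no match.
compose_major() {
  printf '%s\n' "$1" | sed -n 's/.*v\([0-9][0-9]*\)\..*/\1/p'
}

# Succeed only when the reported major version is at least 2.
require_compose_v2() {
  local major
  major=$(compose_major "$1")
  [ -n "$major" ] && [ "$major" -ge 2 ]
}

# Usage:
#   require_compose_v2 "$(docker compose version)" || echo "Compose v2 plugin missing" >&2
```

The legacy docker-compose v1 binary prints its version without the v prefix, so it yields no match and the check fails.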

Install ZisK on the host

The worker containers mount the toolchain + proving-key bundle read-only from the host. Run ziskup on the host once so that /opt/zisk/ exists with the proving key inside, following the developer install guides (Linux, macOS).

At the ziskup prompt, pick option 1 (Install proving key). On macOS the bundle lands at /Library/Application Support/ZisK/; set ZISK_HOME to that path before running docker compose so the worker containers find it (the compose file falls back to /opt/zisk when the env var is unset).
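The ZISK_HOME fallback can be sketched in shell. The snippet exports the macOS bundle path only when running on Darwin and only when ZISK_HOME is not already set, then prints the directory the workers would mount. resolve_zisk_home is an illustrative helper that mirrors the documented ${ZISK_HOME:-/opt/zisk} default; it is not something the repo ships.

```shell
# Mirror the compose file's documented fallback: use ZISK_HOME when set, else /opt/zisk.
resolve_zisk_home() {
  printf '%s\n' "${ZISK_HOME:-/opt/zisk}"
}

# On macOS, default ZISK_HOME to the bundle location unless the caller already set it.
# The path contains a space, so it must stay quoted.
if [ "$(uname -s)" = "Darwin" ]; then
  export ZISK_HOME="${ZISK_HOME:-/Library/Application Support/ZisK}"
fi

echo "workers will mount: $(resolve_zisk_home)"
```

Run it in the same shell session that later invokes docker compose, so the exported variable is visible to Compose.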

ZisK also needs cargo-zisk on the smoke-test host

The gcd host program in the smoke-test step further down is built with cargo run and links against the ZisK toolchain. The same ziskup install puts cargo-zisk on your PATH, so running ziskup on the host satisfies both the bundle-mount requirement and the smoke-test toolchain requirement at once.


Clone the repo

The compose file, the coordinator + worker Dockerfiles, and the Prometheus / Grafana provisioning all live under distributed/deploy/docker/ in the upstream repo. Clone it once and run everything from the workspace root:

git clone https://github.com/0xPolygonHermez/zisk.git
cd zisk

The relevant files:

distributed/deploy/docker/
├── compose.yaml # the four-service stack
├── Dockerfile.coordinator # coordinator image
└── Dockerfile.worker # worker image (CPU/GPU via build arg)
distributed/deploy/config/
├── coordinator.toml # mounted RO into the coordinator container
└── worker.toml # mounted RO into each worker replica

Build the images

First-run builds compile the prover stack from source inside the images, which takes several minutes. From the workspace root:

docker compose -f distributed/deploy/docker/compose.yaml build

This builds two images: zisk-coordinator:latest and zisk-worker:latest (CPU build by default).

GPU workers

For a GPU-accelerated worker, rebuild with the GPU build arg:

docker compose -f distributed/deploy/docker/compose.yaml \
build --build-arg GPU=true worker

The container still needs CUDA exposed via the NVIDIA Container Toolkit — install it on the host and add the relevant deploy.resources.reservations.devices block to the worker service in compose.yaml. See zisk-worker / Backend selection.
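As an alternative to editing compose.yaml in place, the device reservation can live in a separate Compose override file. The sketch below writes one. The file name compose.gpu.yaml is an arbitrary choice, and reserving all GPUs (count: all) is an assumption to adjust for your host. The devices block follows the Compose specification's GPU reservation syntax.

```shell
# Write a Compose override that reserves NVIDIA GPUs for the worker service.
# The file name is arbitrary; "worker" matches the service name in compose.yaml.
GPU_OVERRIDE="${GPU_OVERRIDE:-compose.gpu.yaml}"

cat > "$GPU_OVERRIDE" <<'EOF'
services:
  worker:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF

echo "wrote $GPU_OVERRIDE"
# Layer it over the base file when starting the stack:
#   docker compose -f distributed/deploy/docker/compose.yaml -f "$GPU_OVERRIDE" up -d --scale worker=4
```

Compose merges multiple -f files in order, so the override adds the reservation without touching the upstream file.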


Start the stack

--scale worker=N controls how many worker replicas spawn. Four is a sane starting point on a beefy host:

docker compose -f distributed/deploy/docker/compose.yaml \
up -d --scale worker=4

-d runs the stack detached. The stack contains:

| Service | Image | Host ports | Notes |
| --- | --- | --- | --- |
| coordinator | zisk-coordinator:latest | 7001, 9091 | Public gRPC on :7001; Prometheus scrape on :9091. Workers reach it via Docker DNS at coordinator:50052. |
| worker | zisk-worker:latest | (none) | Replicas spawned by --scale. Each gets an auto-generated worker_id. Mounts the host's ${ZISK_HOME:-/opt/zisk} RO. |
| prometheus | prom/prometheus:v2.51.0 | 9090 | Auto-scrapes the coordinator's /metrics. |
| grafana | grafana/grafana:11.2.0 | 3000 | Anonymous Admin, datasource + dashboards auto-provisioned. Dev-only; lock down for any non-local deploy. |

Watch the stack come up:

docker compose -f distributed/deploy/docker/compose.yaml logs -f

The coordinator emits worker registered: <uuid> capacity=10 lines as each worker dials in.
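Those registration lines give a quick way to confirm that every replica has dialled in. This sketch counts them, assuming the log phrase is exactly worker registered as quoted above. count_registered is an illustrative name.

```shell
# Count lines on stdin that contain the coordinator's "worker registered" phrase.
# grep -c exits non-zero when the count is 0, so normalise the status with || true.
count_registered() {
  grep -c 'worker registered' || true
}

# Usage:
#   docker compose -f distributed/deploy/docker/compose.yaml logs --no-color coordinator \
#     | count_registered
```

The count reflects log lines, not live workers, so it may exceed the replica count if a worker reconnects and registers again.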


Verify

A /health probe from the host confirms the coordinator is reachable on the mapped port:

curl -i http://127.0.0.1:9091/health
# → HTTP/1.1 200 OK
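Right after up -d the coordinator may not answer yet, so scripts usually poll rather than probe once. This is a minimal retry sketch. wait_until is an illustrative helper, and the 60-second budget in the usage line is an arbitrary assumption.

```shell
# Retry a command once per second until it succeeds or the timeout (in seconds) expires.
# Usage: wait_until TIMEOUT COMMAND [ARGS...]
wait_until() {
  local timeout="$1"; shift
  local waited=0
  until "$@"; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}

# Usage: block for up to 60 s until the coordinator's /health endpoint answers with success.
#   wait_until 60 curl -fsS -o /dev/null http://127.0.0.1:9091/health
```

curl -f makes HTTP error statuses count as failures, so the loop only ends on a successful response or when the budget runs out.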

Inspect container status:

docker compose -f distributed/deploy/docker/compose.yaml ps

You should see four services (coordinator, worker ×N, prometheus, grafana) all in the running state, with coordinator reporting healthy after the start period (it has a grpc_health_probe healthcheck baked in).

Open Grafana at http://127.0.0.1:3000 — anonymous Admin gets you in without a password, and the pre-provisioned ZisK dashboard is wired to the Prometheus datasource.
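The same check can cover all three HTTP endpoints the stack publishes on the host. The sketch below uses the coordinator's /health path from this page plus the standard health endpoints of Prometheus (/-/healthy) and Grafana (/api/health). The function names and the CURL_BIN override are illustrative.

```shell
# Probe one URL; CURL_BIN can be overridden (for example, for a dry run).
probe() {
  "${CURL_BIN:-curl}" -fsS -o /dev/null --max-time 5 "$1"
}

# Print "<name>: up" or "<name>: DOWN" for a single endpoint.
report() {
  if probe "$2" 2>/dev/null; then
    echo "$1: up"
  else
    echo "$1: DOWN"
  fi
}

# Sweep the three HTTP endpoints mapped to the host.
sweep_stack() {
  report coordinator http://127.0.0.1:9091/health
  report prometheus  http://127.0.0.1:9090/-/healthy
  report grafana     http://127.0.0.1:3000/api/health
}

# Usage: sweep_stack
```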


Submit a smoke-test job

The repo ships a gcd example whose remote-host binary submits a real proving job. Run it on the host (not inside a container) — it'll talk to the coordinator's mapped :7001.

The remote-host binary targets http://localhost:7000 by default. The Docker stack maps the coordinator to :7001, so edit examples/gcd/host/src/prover-clients/remote.rs and change the URL to http://localhost:7001 before running:

cd examples/gcd/host
cargo run --release --bin remote-host
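The URL edit that precedes the cargo run step can be scripted instead of done by hand. This sketch assumes remote.rs contains the literal string http://localhost:7000, which has not been verified here. retarget_url is an illustrative helper.

```shell
# Replace one URL with another throughout a file, keeping a .bak copy of the original.
# The -i.bak form works with both GNU sed (Linux) and BSD sed (macOS).
retarget_url() {
  local file="$1" old="$2" new="$3"
  sed -i.bak "s#${old}#${new}#g" "$file"
}

# Usage, from the workspace root:
#   retarget_url examples/gcd/host/src/prover-clients/remote.rs \
#     http://localhost:7000 http://localhost:7001
```

If the substitution changes nothing, inspect the file manually; the URL may be assembled differently from what this sketch assumes.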

The binary builds a ProverClient::remote("http://localhost:7001") internally and uploads the gcd guest ELF. While it runs, tail the coordinator and worker logs:

docker compose -f distributed/deploy/docker/compose.yaml logs -f coordinator worker

You should see the job move through Queued → Running → Completed, along with segment-assignment lines, witness-generation output, and a job complete message. When remote-host exits cleanly, the proof is on disk.


Scale the worker pool

Bump --scale worker=N up or down on the fly without touching the coordinator. Re-running up -d is idempotent — only the worker count changes:

docker compose -f distributed/deploy/docker/compose.yaml \
up -d --scale worker=8

Each replica gets a unique auto-generated worker_id (the mounted worker.toml intentionally leaves worker_id unset). The coordinator distributes segments proportionally to the advertised compute_units (defaults to 10 per replica; edit distributed/deploy/config/worker.toml and restart to change).
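After rescaling, it can be worth confirming that the running replica count matches the number you asked for. This sketch counts the container IDs that docker compose ps -q worker prints, one per line. The helper names are illustrative.

```shell
# Count non-empty lines on stdin (one container ID per line from `ps -q`).
# grep -c exits non-zero when the count is 0, so normalise the status with || true.
count_ids() {
  grep -c . || true
}

# Succeed when the observed replica count equals the expected one.
replicas_match() {
  [ "$1" -eq "$2" ]
}

# Usage, after scaling to 8:
#   n=$(docker compose -f distributed/deploy/docker/compose.yaml ps -q worker | count_ids)
#   replicas_match "$n" 8 || echo "expected 8 workers, found $n" >&2
```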


Manage the stack

Day-two operations are handled by Docker Compose directly. All commands run from the workspace root, prefixed with docker compose -f distributed/deploy/docker/compose.yaml.
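Because every command shares that prefix, a small shell function saves typing. The name zdc and the DOCKER_BIN override are illustrative conveniences, not something the repo provides.

```shell
# Hypothetical wrapper: run docker compose against the ZisK stack's compose file.
# Set DOCKER_BIN=echo to preview the expanded command without running it.
zdc() {
  "${DOCKER_BIN:-docker}" compose -f distributed/deploy/docker/compose.yaml "$@"
}

# Usage, from the workspace root:
#   zdc ps
#   zdc logs -f coordinator
#   zdc restart coordinator
```

The compose file path is relative, so the function only works from the workspace root.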

Inspect status

docker compose -f distributed/deploy/docker/compose.yaml ps

Tail the logs

docker compose -f distributed/deploy/docker/compose.yaml logs -f # all services
docker compose -f distributed/deploy/docker/compose.yaml logs -f coordinator # one service

Restart a service

Pick up a config edit (e.g. after editing coordinator.toml) by restarting that container:

docker compose -f distributed/deploy/docker/compose.yaml restart coordinator

Stop / start the stack

docker compose -f distributed/deploy/docker/compose.yaml stop # graceful shutdown, containers preserved
docker compose -f distributed/deploy/docker/compose.yaml start # restart the stopped containers

stop sends SIGTERM, which triggers the binaries' graceful shutdown windows (see zisk-coordinator / Signals).
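Compose waits 10 seconds by default after SIGTERM before escalating to SIGKILL, which may be shorter than the binaries' shutdown windows. This sketch passes a longer timeout. The 60-second default is an arbitrary assumption, and graceful_stop is an illustrative name.

```shell
# Hypothetical helper: stop the stack, allowing N seconds after SIGTERM
# before Compose escalates to SIGKILL. DOCKER_BIN=echo previews the command.
graceful_stop() {
  "${DOCKER_BIN:-docker}" compose -f distributed/deploy/docker/compose.yaml \
    stop --timeout "${1:-60}"
}

# Usage: graceful_stop 90
```

Pick a value at least as long as the shutdown window documented for the coordinator and worker binaries.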


Updating

Pull the latest revs, rebuild the affected images, and re-converge:

git pull
docker compose -f distributed/deploy/docker/compose.yaml \
build coordinator worker
docker compose -f distributed/deploy/docker/compose.yaml \
up -d --scale worker=4

compose up -d is idempotent — containers whose image hash hasn't changed are left alone; only the ones with new images get recreated.


Tearing down

docker compose -f distributed/deploy/docker/compose.yaml down

Add -v to also remove the Grafana data volume (loses any dashboard edits you made via the UI):

docker compose -f distributed/deploy/docker/compose.yaml down -v

The built images (zisk-coordinator:latest, zisk-worker:latest) stay on disk for next time. Remove them explicitly if you want a clean slate:

docker rmi zisk-coordinator:latest zisk-worker:latest