Deploy with Docker
A step-by-step guide to the containerised prover stack. A
single docker compose up brings the whole cluster online: one
coordinator, N workers (scaled at the command line), Prometheus
scraping the coordinator, and Grafana with the ZisK dashboard
pre-provisioned. By the end of this page you'll have the full
stack running and a real gcd proof on disk.
When to pick this
Pick this path when you want the cluster bundled as containers rather than as host services or hand-launched binaries. It is the right shape for:
- A demo or CI smoke test that needs to tear down cleanly.
- A single-host staging environment with Prometheus + Grafana attached out of the box.
- Shipping the prover as part of a wider docker-compose stack.
For a 24/7 production prover under systemd or launchd, see
Deploy as services; for a developer machine where
you just want the binaries on PATH, see
Deploy the binaries directly.
Prerequisites
| Requirement | Notes |
|---|---|
| Linux or macOS | Tested on Ubuntu 22.04+ and macOS 13+ (Apple Silicon or Intel). |
| Docker Engine 24+ | Must include the Compose plugin v2 (docker compose version should print v2.x). |
| ~32 GB RAM | Each worker container preallocates the Assembly emulator's shared regions; the host needs enough headroom. |
| ZisK bundle on host (/opt/zisk/) | Worker containers mount the toolchain + proving key read-only from this path. ziskup is the easiest way to populate it. |
| Internet access | First docker compose build pulls base images, Rust toolchain, and crates from the network. |
| Open ports 7001, 9090, 9091, 3000 | Coordinator public API, Prometheus, coordinator metrics, and Grafana on the host side (mapped from containers). |
System dependencies (apt / brew packages used to populate
/opt/zisk/) live in the developer install guides:
Linux,
macOS.
Install Docker
Linux installs Docker Engine directly, as in the steps below; macOS uses Docker Desktop, which also includes the Compose plugin.
1. Install Docker Engine. The upstream convenience script is the fastest path on Ubuntu / Debian:
curl -fsSL https://get.docker.com | sudo sh
For other distros or hardened environments, follow the official Docker Engine install docs.
2. Allow your user to run docker without sudo (optional but
recommended):
sudo usermod -aG docker $USER
newgrp docker # apply the group change without logging out
3. Install the Docker Compose v2 plugin. Drop the latest
release into Docker's CLI-plugins directory so docker compose
becomes available as a subcommand:
mkdir -p ~/.docker/cli-plugins
curl -SL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 \
-o ~/.docker/cli-plugins/docker-compose
chmod +x ~/.docker/cli-plugins/docker-compose
4. Verify.
docker version
docker compose version # must print "Docker Compose version v2.x"
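The version check can also be scripted so that a setup script fails early. The sketch below is a minimal example: the helper name is_compose_v2 is hypothetical, and it assumes the version string has the "Docker Compose version v2.x" shape quoted above.

```shell
# Hypothetical helper: succeed only if the given version string reports Compose v2.
# Assumes the "Docker Compose version v2.x" format quoted in the Verify step.
is_compose_v2() {
  case "$1" in
    *"version v2."*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Call it as is_compose_v2 "$(docker compose version)" and stop the setup if it returns non-zero.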
Install ZisK on the host
The worker containers mount the toolchain + proving-key bundle
read-only from the host. Run ziskup on the host once so that
/opt/zisk/ exists with the proving key inside, following the
developer install guide for your OS (Linux or macOS).
At the ziskup prompt, pick option 1 (Install proving key).
On macOS the bundle lands at
/Library/Application Support/ZisK/ — set ZISK_HOME to that
path before docker compose so the worker containers find it
(the compose file falls back to /opt/zisk when the env var is
unset).
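A pre-flight check can catch a missing bundle before the stack starts. The sketch below resolves the path with the same ZISK_HOME-or-/opt/zisk fallback described above; both helper names are hypothetical, and the check only confirms that the directory exists, not that the proving key inside it is complete.

```shell
# Resolve the bundle path: ZISK_HOME if set and non-empty, otherwise /opt/zisk.
zisk_bundle_dir() {
  printf '%s\n' "${ZISK_HOME:-/opt/zisk}"
}

# Hypothetical pre-flight check: fail if the resolved directory does not exist.
# It does not inspect the directory contents.
check_zisk_bundle() {
  dir="$(zisk_bundle_dir)"
  if [ -d "$dir" ]; then
    echo "ZisK bundle directory found: $dir"
  else
    echo "ZisK bundle directory missing: $dir (run ziskup first)" >&2
    return 1
  fi
}
```

On macOS, export ZISK_HOME="/Library/Application Support/ZisK" first; the quoting in the helpers copes with the space in that path.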
cargo-zisk on the smoke-test host
The gcd host program in the smoke-test step further down
is built with cargo run and links against the ZisK toolchain.
The same ziskup install puts cargo-zisk on your PATH, so
running ziskup on the host satisfies both the bundle-mount
requirement and the smoke-test toolchain requirement at once.
Clone the repo
The compose file, the coordinator + worker Dockerfiles, and the
Prometheus / Grafana provisioning all live under
distributed/deploy/docker/ in the upstream repo. Clone it once
and run everything from the workspace root:
git clone https://github.com/0xPolygonHermez/zisk.git
cd zisk
The relevant files:
distributed/deploy/docker/
├── compose.yaml # the four-service stack
├── Dockerfile.coordinator # coordinator image
└── Dockerfile.worker # worker image (CPU/GPU via build arg)
distributed/deploy/config/
├── coordinator.toml # mounted RO into the coordinator container
└── worker.toml # mounted RO into each worker replica
Build the images
First-run builds compile the prover stack from source inside the images, which takes several minutes. From the workspace root:
docker compose -f distributed/deploy/docker/compose.yaml build
This builds two images: zisk-coordinator:latest and
zisk-worker:latest (CPU build by default).
For a GPU-accelerated worker, rebuild with the GPU build arg:
docker compose -f distributed/deploy/docker/compose.yaml \
build --build-arg GPU=true worker
The container still needs CUDA exposed via the NVIDIA Container
Toolkit — install it on the host and add the relevant
deploy.resources.reservations.devices block to the worker
service in compose.yaml. See
zisk-worker / Backend selection.
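One way to add that block without editing compose.yaml in place is a Compose override file. The sketch below writes one. The file name compose.gpu.yaml and the helper name are hypothetical; the reservation itself uses the standard Compose GPU device syntax, and count: all is an assumption to adjust for your host.

```shell
# Hypothetical helper: write a Compose override that reserves NVIDIA GPUs
# for the worker service. The output path is the first argument;
# compose.gpu.yaml is only an illustrative default.
write_gpu_override() {
  out="${1:-compose.gpu.yaml}"
  cat > "$out" <<'EOF'
services:
  worker:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all          # assumption: expose every GPU on the host
              capabilities: [gpu]
EOF
}
```

After writing the file, pass both files to Compose, for example docker compose -f distributed/deploy/docker/compose.yaml -f compose.gpu.yaml up -d --scale worker=4. Compose merges later files over earlier ones, so the worker service keeps its original settings and gains the device reservation.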
Start the stack
--scale worker=N controls how many worker replicas spawn. Four
is a sane starting point on a beefy host:
docker compose -f distributed/deploy/docker/compose.yaml \
up -d --scale worker=4
-d runs the stack detached. The stack contains:
| Service | Image | Host ports | Notes |
|---|---|---|---|
| coordinator | zisk-coordinator:latest | 7001, 9091 | Public gRPC on :7001; Prometheus scrape on :9091. Workers reach it via Docker DNS at coordinator:50052. |
| worker | zisk-worker:latest | (none) | Replicas spawned by --scale. Each gets an auto-generated worker_id. Mounts the host's ${ZISK_HOME:-/opt/zisk} RO. |
| prometheus | prom/prometheus:v2.51.0 | 9090 | Auto-scrapes the coordinator's /metrics. |
| grafana | grafana/grafana:11.2.0 | 3000 | Anonymous Admin, datasource + dashboards auto-provisioned. Dev-only; lock down for any non-local deploy. |
Watch the stack come up:
docker compose -f distributed/deploy/docker/compose.yaml logs -f
The coordinator emits worker registered: <uuid> capacity=10
lines as each worker dials in.
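To confirm that every replica has registered, the registration lines can be counted rather than read by eye. This is a minimal sketch with a hypothetical helper name; it assumes the log line format quoted above and reads log text from standard input.

```shell
# Hypothetical helper: count "worker registered:" lines on standard input.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# function's exit status at 0 while still printing 0.
count_registered_workers() {
  grep -c 'worker registered:' || true
}
```

Pipe the coordinator logs into it, for example docker compose -f distributed/deploy/docker/compose.yaml logs coordinator | count_registered_workers, and compare the result with the --scale value. A worker that reconnects would presumably be counted more than once, so treat the number as a lower-bound check.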
Verify
A /health probe from the host confirms the coordinator is
reachable on its mapped metrics port (9091):
curl -i http://127.0.0.1:9091/health
# → HTTP/1.1 200 OK
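Right after up -d the coordinator may need a few seconds before the probe succeeds, so scripts usually retry it. The helper below is a generic sketch with a hypothetical name; it is not part of the repo, and the attempt count and delay are illustrative.

```shell
# Hypothetical helper: run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0            # command succeeded
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"      # wait before the next attempt
    fi
    i=$((i + 1))
  done
  return 1                # every attempt failed
}
```

For example, retry 30 2 curl -fsS http://127.0.0.1:9091/health waits up to about a minute; curl -f makes HTTP error responses count as failures.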
Inspect container status:
docker compose -f distributed/deploy/docker/compose.yaml ps
You should see four services (coordinator, worker ×N,
prometheus, grafana) all in the running state, with
coordinator reporting healthy after the start period (it has
a grpc_health_probe healthcheck baked in).
Open Grafana at http://127.0.0.1:3000 — anonymous Admin gets you in without a password, and the pre-provisioned ZisK dashboard is wired to the Prometheus datasource.
Submit a smoke-test job
The repo ships a gcd example whose remote-host binary submits
a real proving job. Run it on the host (not inside a
container) — it'll talk to the coordinator's mapped :7001.
The remote-host binary targets http://localhost:7000 by
default. The Docker stack maps the coordinator to :7001, so
edit examples/gcd/host/src/prover-clients/remote.rs and change
the URL to http://localhost:7001 before running:
cd examples/gcd/host
cargo run --release --bin remote-host
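The URL edit described before these commands can be scripted instead of done by hand. The sketch below uses a hypothetical helper name and assumes the default URL appears literally as http://localhost:7000 in the file; check the result with git diff before building.

```shell
# Hypothetical helper: rewrite the default coordinator URL in a source file.
# Assumes the URL appears literally as http://localhost:7000 in that file.
retarget_coordinator_url() {
  file="$1"
  tmp="$(mktemp)" || return 1
  # Go through a temp file: sed -i is not portable between GNU and BSD sed.
  sed 's#http://localhost:7000#http://localhost:7001#g' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

From the workspace root, run retarget_coordinator_url examples/gcd/host/src/prover-clients/remote.rs before the cargo run step.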
With the URL edited, the binary builds a
ProverClient::remote("http://localhost:7001") internally and
uploads the gcd guest ELF. While it runs, tail the
coordinator and worker logs:
docker compose -f distributed/deploy/docker/compose.yaml logs -f coordinator worker
You should see the job move through Queued → Running → Completed, along with segment-assignment lines, witness-generation output, and a
job complete message. When remote-host exits cleanly, the
proof is on disk.
Scale the worker pool
Bump --scale worker=N up or down on the fly without touching
the coordinator. Re-running up -d is idempotent — only the
worker count changes:
docker compose -f distributed/deploy/docker/compose.yaml \
up -d --scale worker=8
Each replica gets a unique auto-generated worker_id (the
mounted worker.toml intentionally leaves worker_id unset).
The coordinator distributes segments proportionally to the
advertised compute_units (defaults to 10 per replica; edit
distributed/deploy/config/worker.toml and restart to change).
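After rescaling, it helps to confirm how many worker containers Compose actually reports. A minimal sketch follows; the helper name is hypothetical, and it relies on docker compose ps -q printing one container ID per line for the named service.

```shell
# Hypothetical helper: print how many worker containers Compose currently reports.
# Run it from the workspace root so the relative compose file path resolves.
count_workers() {
  docker compose -f distributed/deploy/docker/compose.yaml ps -q worker | wc -l | tr -d ' '
}
```

The tr call strips the padding that some wc implementations add, so the output can be compared directly with the --scale value.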
Manage the stack
Day-two operations are handled by Docker Compose directly. All
commands run from the workspace root, prefixed with
docker compose -f distributed/deploy/docker/compose.yaml.
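Typing that prefix repeatedly is error-prone, so a small shell function can carry it. This is a convenience sketch, not something the repo ships; the name zisk_compose is hypothetical.

```shell
# Hypothetical wrapper: forward all arguments to docker compose with the
# stack's compose file. The path is relative, so call it from the workspace root.
zisk_compose() {
  docker compose -f distributed/deploy/docker/compose.yaml "$@"
}
```

With it defined in your shell, zisk_compose ps and zisk_compose logs -f coordinator are equivalent to the full commands shown in the subsections that follow.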
Inspect status
docker compose -f distributed/deploy/docker/compose.yaml ps
Tail the logs
docker compose -f distributed/deploy/docker/compose.yaml logs -f # all services
docker compose -f distributed/deploy/docker/compose.yaml logs -f coordinator # one service
Restart a service
Pick up a config edit (e.g. after editing coordinator.toml) by
restarting that container:
docker compose -f distributed/deploy/docker/compose.yaml restart coordinator
Stop / start the stack
docker compose -f distributed/deploy/docker/compose.yaml stop # graceful shutdown, containers preserved
docker compose -f distributed/deploy/docker/compose.yaml start # restart the stopped containers
stop sends SIGTERM, which triggers the binaries' graceful
shutdown windows (see
zisk-coordinator / Signals).
Updating
Pull the latest revisions, rebuild the affected images, and re-converge:
git pull
docker compose -f distributed/deploy/docker/compose.yaml \
build coordinator worker
docker compose -f distributed/deploy/docker/compose.yaml \
up -d --scale worker=4
Running docker compose up -d again is idempotent: containers whose image hash
hasn't changed are left alone; only the ones with new images get
recreated.
Tearing down
docker compose -f distributed/deploy/docker/compose.yaml down
Add -v to also remove the Grafana data volume (loses any
dashboard edits you made via the UI):
docker compose -f distributed/deploy/docker/compose.yaml down -v
The built images (zisk-coordinator:latest,
zisk-worker:latest) stay on disk for next time. Remove them
explicitly if you want a clean slate:
docker rmi zisk-coordinator:latest zisk-worker:latest