
Deploy as services

A step-by-step guide to the production prover deployment: each binary runs under a real service manager (systemd on Linux, launchd on macOS) as a dedicated service user, with config in /etc/zisk/ and state on its own writable path. By the end of this page, a coordinator and one worker are running as managed services and have proved a real gcd job.

When to pick this

Pick this path when the prover needs to run 24/7 under a real service manager, survive reboots, restart on failure, and route logs to journald (Linux) or /var/log/zisk-*/ (macOS). It is the right shape for:

  • A bare-metal or VM host that runs the coordinator full-time.
  • One or more worker hosts that need to register at boot.
  • Anything where you want the binary launched by PID 1 instead of by your shell.

For shorter-lived workflows where you launch the binaries by hand, see Deploy the binaries directly; for a containerised stack with Prometheus and Grafana bundled in, see Deploy with Docker.


Prerequisites

| Requirement | Notes |
| --- | --- |
| Linux / macOS | Linux uses systemd; macOS uses launchd. No Windows. |
| sudo access | The install scripts write to /usr/local/bin, /etc/zisk, and the service manager directory. |
| Internet access | The scripts self-bootstrap from raw.githubusercontent.com and pull ziskup for the bundle. |
| ~32 GB RAM (worker hosts) | The Assembly emulator preallocates large shared regions on the worker side. |
| Open ports 7000, 50051, 9090 | Public API, cluster gRPC, and metrics on the coordinator host. Workers only need outbound to :50051. |

System dependencies (apt / brew packages) live in the developer install guides: Linux, macOS.
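
A quick preflight can confirm that a host matches the requirements table before an installer runs. The sketch below is not part of the upstream scripts: the function names are hypothetical, and the RAM check assumes a Linux host that exposes /proc/meminfo (on macOS it simply prints nothing for RAM).

```shell
#!/bin/sh
# Preflight sketch (hypothetical helper, not shipped with ZisK).

# Map a kernel name, as printed by `uname -s`, to the service manager
# this guide uses on that OS.
service_manager_for() {
    case "$1" in
        Linux)  echo systemd ;;
        Darwin) echo launchd ;;
        *)      echo unsupported ;;
    esac
}

# Convert KiB (the unit used in /proc/meminfo) to whole GiB, rounding down.
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Print total RAM in GiB on Linux; print nothing where /proc/meminfo is absent.
total_ram_gib() {
    [ -r /proc/meminfo ] || return 0
    kib_to_gib "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
}

# Report the detected service manager and, where available, total RAM.
preflight() {
    mgr=$(service_manager_for "$(uname -s)")
    echo "service manager: $mgr"
    [ "$mgr" != unsupported ] || return 1
    ram=$(total_ram_gib)
    if [ -n "$ram" ]; then
        echo "total RAM: ${ram} GiB (worker hosts want roughly 32 GB)"
    fi
}
```

Source the file and call `preflight` on each candidate host. It only reads system information and changes nothing.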


How the install scripts work

Two scripts ship in the upstream repo, one per binary, and both are curl-pipe-able from a fresh host:

distributed/deploy/scripts/coordinator/install.sh
distributed/deploy/scripts/worker/install.sh

Each script detects the OS and lays things out conceptually the same way, with platform-specific homes:

| Asset | Linux (systemd) | macOS (launchd) |
| --- | --- | --- |
| Binary | /usr/local/bin/zisk-{coordinator,worker} | /usr/local/bin/zisk-{coordinator,worker} |
| Config | /etc/zisk/{coordinator,worker}.toml | /etc/zisk/{coordinator,worker}.toml |
| Service unit | /etc/systemd/system/zisk-*.service | /Library/LaunchDaemons/com.zisk.*.plist |
| Writable state | /var/lib/zisk-{coordinator,worker}/ | /usr/local/var/zisk-{coordinator,worker}/ |
| Shared bundle | /opt/zisk/ | /Library/Application Support/ZisK/ |
| Logs | journald (journalctl -u zisk-*) | /var/log/zisk-*/zisk-*.log (rotated by newsyslog) |
| Service user | zisk-coordinator / zisk-worker | zisk-coordinator / zisk-worker |
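
After an install, the Linux column of the layout table can be checked mechanically. The following is a minimal sketch under the assumption that the paths are exactly those listed above; the helper names are hypothetical, and it should run as root because /etc/zisk may not be readable by ordinary users.

```shell
#!/bin/sh
# Print the Linux paths the layout table lists for one role
# ("coordinator" or "worker"), one per line.
expected_linux_paths() {
    role=$1
    echo "/usr/local/bin/zisk-$role"
    echo "/etc/zisk/$role.toml"
    echo "/etc/systemd/system/zisk-$role.service"
    echo "/var/lib/zisk-$role"
}

# Report each expected path as present or missing.
# Returns non-zero if anything is missing.
check_linux_layout() {
    status=0
    for path in $(expected_linux_paths "$1"); do
        if [ -e "$path" ]; then
            echo "ok       $path"
        else
            echo "missing  $path"
            status=1
        fi
    done
    return $status
}
```

For example, `check_linux_layout coordinator` on the coordinator host prints one line per asset and exits non-zero if the install is incomplete.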

Install the coordinator

Run the coordinator install script on the host you've chosen as the coordinator. The OS detection and behaviour differ slightly, so the verification commands branch — pick your OS below.

1. Run the installer.

curl -fsL https://raw.githubusercontent.com/0xPolygonHermez/zisk/main/distributed/deploy/scripts/coordinator/install.sh \
| sudo bash -s -- --api-port 7000

The script, in one pass:

  1. Pulls the shared toolchain into /opt/zisk/ via ziskup --nokey (the coordinator doesn't need the proving key).
  2. Creates the zisk-coordinator system user (home /var/empty, no login).
  3. Installs /usr/local/bin/zisk-coordinator and the example config at /etc/zisk/coordinator.toml.
  4. Provisions /var/lib/zisk-coordinator/ for writable state (includes a cache/ subdir for registered guest ELFs).
  5. Writes a hardened systemd unit at /etc/systemd/system/zisk-coordinator.service with NoNewPrivileges, ProtectSystem=strict, ProtectHome, LimitNOFILE=65535, and Nice=-10.
  6. Runs systemctl enable --now zisk-coordinator — the service is up immediately and survives reboots.

2. Verify the service is up.

sudo systemctl status zisk-coordinator

You should see active (running). Tail the logs to confirm three "listening on" lines for ports 7000, 50051, and 9090:

sudo journalctl -u zisk-coordinator -f

3. Probe the health endpoint.

curl -i http://127.0.0.1:9090/health # → HTTP/1.1 200 OK
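
In automation, a single probe can race the service start, so it may help to poll the health endpoint for a short while instead. The sketch below assumes the default metrics port 9090 from this page; the helper names, the 30 attempts, and the 2-second delay are illustrative choices, not upstream defaults.

```shell
#!/bin/sh
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Skip the sleep after the final failed attempt.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Poll the coordinator health endpoint until it answers successfully.
# curl -f turns HTTP error statuses into a non-zero exit code.
wait_for_coordinator() {
    url=${1:-http://127.0.0.1:9090/health}
    retry 30 2 curl -fsS -o /dev/null "$url"
}
```

A deploy script could then run `wait_for_coordinator || exit 1` right after the installer and before installing any workers.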

Install the worker

Run the worker install script on each host you've chosen as a worker. Point it at the coordinator's cluster port (:50051). Skip this step if the coordinator host is also a worker, then re-run the worker installer on dedicated worker hosts later.

1. Run the installer.

curl -fsL https://raw.githubusercontent.com/0xPolygonHermez/zisk/main/distributed/deploy/scripts/worker/install.sh \
| sudo bash -s -- --coordinator-url http://127.0.0.1:50051

The script mirrors the coordinator install but:

  • Pulls the bundle with the proving key (workers need it to generate proofs).
  • Creates the zisk-worker system user.
  • Drops /usr/local/bin/zisk-worker and the example config at /etc/zisk/worker.toml (with coordinator.url pre-populated to the URL you supplied).
  • Provisions /var/lib/zisk-worker/ for input staging and /var/log/zisk/ for logs.
  • Writes /etc/systemd/system/zisk-worker.service with Restart=on-failure.
  • Runs systemctl enable --now zisk-worker.

2. Verify the worker registered.

sudo systemctl status zisk-worker
sudo journalctl -u zisk-worker -f

In the coordinator's journal (on the coordinator host) you should see:

INFO worker registered: UUID capacity=10
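
To check registrations without watching the journal by eye, the log line above can be counted. This sketch assumes the message text matches the sample line exactly; the helper name is hypothetical, and the pattern needs adjusting if your build logs differently.

```shell
#!/bin/sh
# Count "worker registered" lines on stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# helper's own exit status at 0 while still printing the count.
count_registered() {
    grep -c 'worker registered:' || true
}
```

On the coordinator host, `sudo journalctl -u zisk-coordinator --since "10 minutes ago" --no-pager | count_registered` prints the number of registration events in that window. A worker that reconnects logs more than once, so the figure counts registrations rather than distinct workers.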

Useful install-script flags

Both scripts accept the same operational knobs regardless of OS. Pass them after bash -s --:

| Flag | Effect |
| --- | --- |
| --binary PATH | Install a pre-built binary instead of pulling via ziskup. |
| --config PATH | Install your own TOML instead of the bundled sample. |
| --no-start | Install but don't start (useful for staged rollouts). |
| --uninstall -y | Reverse the install (see Uninstalling). |

Coordinator-only: --api-port, --cluster-port, --metrics-port. Worker-only: --coordinator-url, --worker-id, --compute-capacity, --proving-key, --with-snark, --gpu / --cpu, --mpi / --no-mpi.
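
When several flags are combined, printing the full command before running it makes a staged rollout easier to review. The helper below is a hypothetical dry-run sketch: it only prints the pipeline and never executes it. It assumes the two script URLs used elsewhere on this page.

```shell
#!/bin/sh
# Base URL of the upstream install scripts, as used on this page.
BASE_URL=https://raw.githubusercontent.com/0xPolygonHermez/zisk/main/distributed/deploy/scripts

# Print (do not run) the install command for a role ("coordinator" or
# "worker"), followed by any extra flags passed after the role.
print_install_cmd() {
    role=$1
    shift
    echo "curl -fsL $BASE_URL/$role/install.sh | sudo bash -s -- $*"
}
```

For instance, `print_install_cmd worker --coordinator-url http://10.0.0.5:50051 --no-start` prints a worker install that stages the binary without starting it. The address 10.0.0.5 is a placeholder for your coordinator host.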


Configure the coordinator

The install script wrote a working example at /etc/zisk/coordinator.toml. The six tables it reads:

| Table | Purpose |
| --- | --- |
| [service] | Identity used in log labels. environment is a free-form tag (development / staging / production). |
| [server] | The public gRPC API where clients submit jobs. shutdown_timeout_seconds is the drain window applied on SIGTERM. |
| [coordinator] | The cluster gRPC port workers dial to register and stream segments. |
| [metrics] | GET /metrics returns Prometheus text; GET /health returns 200 OK while alive. |
| [logging] | pretty for human-friendly stdout; json for log aggregators (Loki, Datadog). |
| [backend] | Internal; must stay coordinator. |

Edit /etc/zisk/coordinator.toml with sudo nano (or your editor of choice), then restart the service to pick up the change:

sudo nano /etc/zisk/coordinator.toml
sudo systemctl restart zisk-coordinator
sudo journalctl -u zisk-coordinator -n 20 # check it came back clean
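
Because a bad edit only shows up at restart, it can be worth keeping a timestamped backup and restoring it if the unit does not come back. The following is a sketch under stated assumptions: the helper names are hypothetical, the 2-second settle time is a heuristic, and `restart_or_rollback` must run as root on a systemd host.

```shell
#!/bin/sh
# Copy a config file to <file>.bak.<timestamp> and print the backup path.
backup_config() {
    src=$1
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Restart a systemd unit; if it is not active shortly afterwards,
# restore the backup, restart again, and return non-zero.
restart_or_rollback() {
    unit=$1
    config=$2
    backup=$3
    systemctl restart "$unit"
    sleep 2
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active"
    else
        echo "$unit failed to start; restoring $backup" >&2
        cp -p "$backup" "$config"
        systemctl restart "$unit"
        return 1
    fi
}
```

A typical sequence is `bak=$(backup_config /etc/zisk/coordinator.toml)`, then the edit, then `restart_or_rollback zisk-coordinator /etc/zisk/coordinator.toml "$bak"`. The same helpers apply to zisk-worker and /etc/zisk/worker.toml.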

Configure the worker

The install script wrote /etc/zisk/worker.toml with coordinator.url already pointing at the URL you passed on the command line. The four tables it reads:

| Table | Purpose |
| --- | --- |
| [worker] | Identity + capacity. compute_units is an abstract weight the coordinator uses to assign segments. |
| [coordinator] | Where the worker dials (already populated by the install script). |
| [connection] | How the worker reacts to network blips: reconnect backoff and heartbeat deadline. |
| [logging] | Same shape as the coordinator's. |

There's intentionally no [backend] table — backend selection (Assembly emulator, Rust emulator, GPU, Plonk) is done via flags in the unit's ExecStart. See zisk-worker for every flag.
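
Since the backend flags live in the unit, it helps to see the current ExecStart before changing it. The sketch below uses standard systemd behaviour (a drop-in with an empty `ExecStart=` clears the original line before setting a new one); the helper names are hypothetical, and the command line you pass in should come from the zisk-worker reference rather than from this example.

```shell
#!/bin/sh
# Extract the ExecStart= lines from unit-file text on stdin.
execstart_lines() {
    grep '^ExecStart=' || true
}

# Render a systemd drop-in that replaces ExecStart with the given
# command line. The empty "ExecStart=" resets the inherited value.
render_execstart_override() {
    printf '[Service]\nExecStart=\nExecStart=%s\n' "$1"
}
```

`systemctl cat zisk-worker | execstart_lines` shows what the unit runs today. To change it, run `sudo systemctl edit zisk-worker`, paste the output of `render_execstart_override` with your full command line, and restart the service. A drop-in is a separate file from the main unit; whether the installer leaves it in place on a re-run has not been verified here.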

Edit and restart:

sudo nano /etc/zisk/worker.toml
sudo systemctl restart zisk-worker
sudo journalctl -u zisk-worker -n 20

Submit a smoke-test job

The repo ships a gcd example whose remote-host binary submits a real proving job over the public gRPC API. Run it from any host that can reach the coordinator's :7000 — typically a developer machine, not the service-managed coordinator / worker host.

ZisK must be installed on the client host

The install scripts don't put cargo-zisk on the client host, and the gcd example needs it to compile. Install via ziskup on the client. See the Linux / macOS guides. No proving key needed on the client, only the toolchain.

Then clone the repo if you haven't already and run the example:

# Skip the clone if you already have one
[ -d zisk ] || git clone https://github.com/0xPolygonHermez/zisk.git
cd zisk/examples/gcd/host
cargo run --release --bin remote-host

The binary builds a ProverClient::remote("http://localhost:7000") internally — the same code path your own host applications use. While it runs, watch the coordinator and worker journals:

  • The coordinator journal shows the job lifecycle (Queued → Running → Completed), segment assignments, and the promoted-aggregator pick.
  • The worker journal shows witness generation, partial proofs, and a job complete message.

When remote-host exits cleanly, the proof is on disk.


Manage the services

Day-two operations are handled by the service manager directly, not by re-running the install script. The commands below use the coordinator as the example; swap zisk-coordinator for zisk-worker (Linux) or com.zisk.coordinator for com.zisk.worker (macOS) to operate on the worker side.

Inspect status

Check whether the service is running, when it last started, and the exit status of its previous run.

sudo systemctl status zisk-coordinator

Look for Active: active (running) and a recent Started timestamp. A failed state means the unit exited and didn't restart — the journal tail below shows why.

Tail the logs

Watch the service's stdout / stderr in real time. Use this when running a smoke test, debugging a misconfig, or tracking down why a worker isn't registering.

stdout/stderr go to journald:

sudo journalctl -u zisk-coordinator -f

Start

Bring a stopped service up. Both install scripts auto-start the service, so you only reach for this after a manual stop or an install run with --no-start.

sudo systemctl start zisk-coordinator

Stop

Gracefully shut the service down. The service manager sends SIGTERM to the binary, which triggers the drain windows described on zisk-coordinator / Signals and zisk-worker / Signals: the coordinator stops accepting new jobs and drains in-flight work; the worker stops accepting assignments but lets its in-flight segments finish.

sudo systemctl stop zisk-coordinator

Restart

Pick up a config edit or push a newly-installed binary into the running process. Equivalent to a stop followed by a start, issued as one command so the gap is minimised.

sudo systemctl restart zisk-coordinator

Disable / re-enable on boot

Stop the service from starting at boot, but leave the unit / plist on disk so you can re-enable later without re-running the installer.

sudo systemctl disable zisk-coordinator
sudo systemctl enable zisk-coordinator # re-enable for next boot

disable removes the symlinks from /etc/systemd/system/*.target.wants/ but leaves the unit file in place. Pair with stop if the service is currently running.


Updating

Both install scripts are idempotent — re-running them in place picks up the latest binary, refreshes the bundle via ziskup, and rewrites the unit/plist. Existing config under /etc/zisk/ is not touched. Pass --no-start to stage the new binary without bouncing the service, then restart on your own schedule:

# Coordinator
curl -fsL https://raw.githubusercontent.com/0xPolygonHermez/zisk/main/distributed/deploy/scripts/coordinator/install.sh \
| sudo bash -s -- --no-start
sudo systemctl restart zisk-coordinator

# Each worker host
curl -fsL https://raw.githubusercontent.com/0xPolygonHermez/zisk/main/distributed/deploy/scripts/worker/install.sh \
| sudo bash -s -- --no-start
sudo systemctl restart zisk-worker

For a rolling fleet update, drain one worker at a time (stopping the worker service lets its in-flight segments finish via the reconnect grace period), upgrade, restart, then move on.
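
The rolling procedure can be sketched as a loop that handles one host at a time and stops at the first failure. This is an illustration, not an upstream tool: the helper names and the `RUN` variable are hypothetical, and running it for real assumes SSH access with non-interactive sudo on each worker. By default it is a dry run that only prints the plan.

```shell
#!/bin/sh
# Worker install script URL, as used on this page.
INSTALL_URL=https://raw.githubusercontent.com/0xPolygonHermez/zisk/main/distributed/deploy/scripts/worker/install.sh

# Drain, upgrade, start, and verify one worker host.
# RUN is the command used to execute each step on the host; it defaults
# to "echo", which prints the step instead of running it.
update_one() {
    host=$1
    run=${RUN:-echo}
    $run "$host" "sudo systemctl stop zisk-worker" </dev/null &&
    $run "$host" "curl -fsL $INSTALL_URL | sudo bash -s -- --no-start" </dev/null &&
    $run "$host" "sudo systemctl start zisk-worker" </dev/null &&
    $run "$host" "systemctl is-active --quiet zisk-worker" </dev/null
}

# Walk the hosts in order; stop at the first host that fails so a bad
# upgrade does not spread across the fleet.
rolling_update() {
    for host in "$@"; do
        update_one "$host" || {
            echo "stopping rollout: $host failed" >&2
            return 1
        }
    done
}
```

`rolling_update worker-1 worker-2` prints the four steps for each host (the host names are placeholders). Setting `RUN=ssh` executes the same steps remotely instead of printing them.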


Uninstalling

Each install script accepts --uninstall, which reverses exactly what it installed: stops and disables the service, removes the unit/plist, deletes the binary, and removes the state and config directories.

# Coordinator
curl -fsL https://raw.githubusercontent.com/0xPolygonHermez/zisk/main/distributed/deploy/scripts/coordinator/install.sh \
| sudo bash -s -- --uninstall -y

# Each worker host
curl -fsL https://raw.githubusercontent.com/0xPolygonHermez/zisk/main/distributed/deploy/scripts/worker/install.sh \
| sudo bash -s -- --uninstall -y

-y skips the confirmation prompt. The shared bundle under /opt/zisk/ (Linux) or /Library/Application Support/ZisK/ (macOS) is left in place because another binary on the host may still depend on it. Remove it by hand only after every ZisK service on the host is gone:

sudo rm -rf /opt/zisk # Linux
sudo rm -rf "/Library/Application Support/ZisK" # macOS
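
A small guard can make the "only after every ZisK service is gone" condition explicit before the bundle is deleted. This sketch assumes the binaries live in /usr/local/bin as listed in the layout table; the helper name is hypothetical.

```shell
#!/bin/sh
# Succeed only if neither ZisK service binary is still installed in the
# given directory (default: /usr/local/bin).
bundle_unused() {
    bindir=${1:-/usr/local/bin}
    for name in zisk-coordinator zisk-worker; do
        if [ -e "$bindir/$name" ]; then
            echo "still installed: $bindir/$name" >&2
            return 1
        fi
    done
    return 0
}
```

On Linux, `bundle_unused && sudo rm -rf /opt/zisk` removes the bundle only when both binaries are gone; on macOS, substitute the "/Library/Application Support/ZisK" path.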