Arithmetization

This page explains how a ZisK execution becomes a system of polynomial constraints. It introduces AIRs and their trace matrices, the fixed/cached/witness column kinds, interactions and buses, the three argument types (permutation, lookup, copy), bus balancing, and the LogUp reduction that turns global bus balancing into per-AIR partial sums checked at aggregation — all tied together by circuits and ZisK's five system buses.

The previous pages described what a ZisK program looks like and how the machine runs it (ISA & processor), and how a RISC-V binary becomes ZisK code in the first place (Transpilation). This page is about the layer beneath all of that: how the prover turns an execution into something it can actually prove.

The short version is that an execution becomes a system of polynomial constraints, and a valid execution is a satisfying assignment of that system. The long version — what kind of polynomials, over what values, hooked up how — is what fills out this page.

Two layers, one pipeline

Every proving system splits cleanly into two layers, and ZisK is no exception:

Layer	Responsibility
Frontend	Express the computation as a structured mathematical object — typically a system of polynomial constraints over a finite field.
Backend	Take that object and produce a cryptographic proof of its satisfiability.

This page is about the frontend. The backend — Goldilocks-field STARKs — is what the Segments page covered: per-segment proving, aggregation trees, recursion, and the root proof. The two layers are decoupled by design — the same frontend can target different backends, and improvements to either don't require redesigning the other.

ZisK's frontend is shaped by a few goals:

Separation of concerns — each chip can be designed, implemented, and tested on its own.
Composability — chips can be reused across different computations or combined in different configurations.
Scalability — different chips can have different sizes, so resources track actual usage rather than worst-case budgets.
Backend independence — the arithmetization can be paired with any compatible proof system.

All of those are achieved by one core abstraction: the AIR.

AIRs

An Algebraic Intermediate Representation (AIR) describes a computation as a matrix of values plus a set of polynomial constraints that must hold at every row.

   col 0   col 1   col 2     col W-1
  ┌──────┬──────┬──────┬─── ─┬──────┐
0 │ ...  │ ...  │ ...  │     │ ...  │  row 0
1 │ ...  │ ...  │ ...  │ ··· │ ...  │  row 1
2 │ ...  │ ...  │ ...  │     │ ...  │  row 2
  │  ⋮   │  ⋮   │  ⋮   │     │  ⋮   │   ⋮
N-1│ ... │ ...  │ ...  │     │ ...  │  row N-1

         ↑ trace matrix T ∈ F^(N×W)

Constraints are polynomials over the column values at the current row and a few rows ahead (the depth of the constraint), evaluated row by row. Every constraint has to evaluate to zero at every row of a valid trace. The matrix is allowed to be wide (many columns) or tall (many rows), and different AIRs in the same circuit can have different dimensions.

ZisK extends the standard definition with AIR values — per-instance scalars that aren't indexed by row. AIR values are how global constants like public inputs and aggregation roots become first-class citizens of the constraint system rather than being awkwardly encoded as per-row data.

The formal definition (whitepaper §4.1) is in a footnote worth, but the idea is just the table above plus "the constraints can also see a small per-instance scalar vector." N is always a power of two because the backend uses a low-degree test over a multiplicative subgroup, and subgroup sizes have to divide a power of two.

Three kinds of column

Not every column has the same lifecycle. ZisK distinguishes three kinds, and the distinction is what lets the proving system avoid redundant work and stay precise about what the verifier learns:

Kind	Determined…	Visible to verifier?	Typical use
Fixed	At setup time	Yes — part of the AIR's description.	Lookup tables, instruction encodings.
Cached	Per program, not per input	No (committed once, reused across all proofs of that program).	The transpiled ROM instruction trace.
Witness	Per execution	No (committed per proof).	Register values, memory accesses, intermediate results.

A fixed column is the same for every execution anywhere — there's no point committing it again per proof. A cached column depends on the program (its ELF) but not on the inputs — committed once at setup, reused across all proofs of that program. A witness column carries the specific execution and gets freshly committed every time.

When an AIR has fixed columns, the trace height N is also fixed at setup.

Interactions and buses

A single AIR is self-contained — its constraints can only reference columns within its own trace. They can't reach over to another AIR's columns. But realistic computations cross AIR boundaries constantly: Main fetches an instruction from the ROM AIR, reads a value the Memory AIR wrote, dispatches a hash to a precompile, queries a base-operation chip's lookup table.

ZisK expresses those relationships without merging every AIR into one monolithic constraint system. The mechanism is interactions — row-level messages one AIR sends and another receives, mediated by a shared channel called a bus.

Each interaction is a tuple (b, m, e):

Component	What it is
`b`	The bus identifier — a fixed field element selecting which bus the interaction is on.
`m`	The multiplicity polynomial — at each row, a signed integer indicating how many times this row sends (positive) or receives (negative) the message.
`e = (e_1, …, e_ℓ)`	The message polynomials — at each row, the ℓ values being communicated.

Both m and e_j are polynomials over the same variables as the AIR's constraints, so they can depend on the trace and on AIR values. A typical sender row has m(r) = +1 and e(r) equal to the message payload; a typical receiver row has m(r) = -1 and e(r) equal to the same payload:

Bus balancing

What ties all those messages together is bus balancing — the single condition that across all AIRs and all rows, the signed sum of m(r) · e(r) on each bus must be zero. Equivalently, every message sent on a bus is received exactly once with matching content. Gaps, duplications, and reorderings all surface as a non-zero sum.

That one condition is the foundation of every cross-AIR argument in ZisK — lookups, permutations, table queries, instruction dispatches, memory consistency. They're all different patterns of "send this, receive that, multiplicity matters."

Arguments

Interactions define the communication structure; arguments specify the semantic relationship being enforced. An argument is realised as a pair of interactions on a shared bus — one AIR sending with positive multiplicity, another receiving with negative multiplicity — chosen so that bus balancing becomes equivalent to the desired relationship.

ZisK uses three argument types:

Argument	What it asserts	Used for
Permutation	Two sequences are permutations of each other — same elements with the same multiplicities, possibly in different order.	Memory's sorted access log; reorderings that preserve content.
Lookup	Every element of a query sequence appears in a table sequence (no length or multiplicity match required).	Range checks, table-driven computations, opcode validation.
Copy	Designated cells across different parts of a witness hold equal values. Implemented as a permutation argument on the cells within each equivalence class.	Wiring intermediate values between AIRs without duplicating them.

The Memory chip's sorted-access log is the canonical permutation example. "This byte is in [0, 255]" is the canonical lookup. And the result of an ALU operation matching the value written to a register is the canonical copy argument.

The LogUp reduction

There's a problem with checking bus balancing naïvely: iterating over every row of every AIR, summing messages, and verifying the totals would force the verifier to re-process the entire witness. That defeats the whole point of succinct proofs.

ZisK uses LogUp to reduce bus balancing to a small set of polynomial constraints the backend can verify in time independent of the trace size. The reduction boils down to two ideas applied in sequence.

First, compress every multi-element message into one field element. Using a verifier-supplied challenge γ, the message vector e(r) = (e_1(r), …, e_ℓ(r)) folds into a single value:

ê(r) = b + γ¹·e_1(r) + γ²·e_2(r) + ⋯ + γ^ℓ·e_ℓ(r)

The bus identifier b is included to prevent cancellation across different buses.

Second, reduce multiset equality to a rational identity. With another challenge α, two multisets S and R are equal if and only if

∑_{s ∈ S}  1/(α + s)   =   ∑_{r ∈ R}  1/(α + r)

This identity is hard to satisfy by accident: the prover has to commit to its witness before α is sampled, so it can't just construct values that make the equation balance.

Each AIR carries an extra partial-sum column S_i whose final value s_i is that AIR's signed contribution to the bus. Once every AIR has computed its s_i, global bus balancing reduces to one scalar equation:

∑ᵢ  s_i  =  0

Each AIR proof exposes its own s_i as a public output. The aggregation step (covered on Segments → The shared challenge) sums them and verifies the equation at the root:

The property worth stressing: each AIR proves only its own local constraints plus the running partial sum. The cross-AIR consistency check collapses into a single arithmetic equality at the very end of the pipeline.

Circuits

A circuit is the whole construction — a collection of AIRs each paired with its set of interactions. Different AIRs can have different trace heights, and bus identifiers connect any two interactions that should be talking to each other.

A circuit is satisfied when two conditions both hold:

Local satisfaction — each AIR's witness satisfies its own constraints at every row.
Global bus balancing — every bus across the whole circuit balances.

Local satisfaction is verified independently per AIR. Global bus balancing is enforced by the aggregation step at the end of the proving pipeline — the place where every AIR's partial sum lands and the verifier checks they cancel to zero.

The five system buses

ZisK uses five named buses for its cross-chip plumbing:

Bus	What flows on it	Used by
Operation Bus	`(op, a, b, c, flag, extra)` — a delegated operation.	Main → Base Operations / Precompiles.
ROM Bus	`(pc, a_imm, b_imm, op, …)` — an instruction at `pc`.	Main reads from the ROM chip.
Memory Bus	`(op_mem, addr, t_mem, width, data)` — memory access.	Anything that touches memory; Memory chip checks consistency.
Continuation Bus	Per-chip state across segment boundaries.	Chips with persistent state (Main, Memory, …).
Table Buses	Per-table lookup queries.	Instruction executors → their lookup tables.

Each bus has its own message schema. The Specialized chips page describes the chips that send and receive on them, and the Segments page covers how the Continuation Bus carries state across boundaries.

Where this picks up

You now have the vocabulary every other part of the proof system builds on: AIRs and trace matrices, fixed/cached/witness columns, interactions (b, m, e), signed multiplicities, bus balancing, the three argument types, LogUp turning balancing into a polynomial-degree argument with per-AIR partial sums, and circuits as collections of AIRs plus interactions.

The next page, Limits, closes the section with the hard upper bounds the current ZisK setup imposes — how large a program, how long an execution, how much input each circuit can handle — and what fraction of each bound a real workload actually consumes.

Two layers, one pipeline​

AIRs​

Three kinds of column​

Interactions and buses​

Bus balancing​

Arguments​

The LogUp reduction​

Circuits​

The five system buses​

Where this picks up​