Transpilation

This page explains how the transpiler bridges a standard RISC-V ELF and the ZisK ISA. It covers why transpilation runs at the start of every emulation rather than being cached, how ZisK's 8-bit instruction alignment leaves room to expand awkward RISC-V instructions, a worked layout example, and the address map the transpiled ROM, input data, and RAM live in at runtime.

The ISA & processor page described the shape in which the ZisK processor expects every instruction: c = op(a, b), with four specific-purpose registers. But the binary you compile with cargo-zisk build is still a standard RISC-V ELF (RV64IMA, the hardware-style ISA). Something has to bridge those two worlds, and that something is the transpiler.

Why transpile at runtime?

Transpilation runs at the start of every emulation, not once and cached. That sounds expensive, but in practice it's no slower than reading a previously transpiled ZisK binary from disk:

Approach	What it costs
Cache the transpiled ZisK ROM	Disk I/O proportional to ROM size. Plus a versioning/invalidation story every time the transpiler changes.
Re-transpile from the RISC-V ELF	CPU time proportional to ELF size. Cheap enough to be unnoticeable. No cache invalidation, no stale-output footguns.

Re-transpiling is the path of least friction: the RISC-V ELF is already the canonical artifact (the verification key is anchored to its transpiled form), so making the in-memory ZisK ROM a deterministic function of it removes a whole category of "which file is the real one?" questions.

The output of the transpiler is a ZisK ROM: a map from program counter to ZisK instruction, ready for the emulation loop in ISA & processor → How an instruction is emulated to walk through.
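As a rough mental model, the ROM can be pictured as a dictionary keyed by program counter. The sketch below is illustrative only: the `Inst` record, its fields, and the `run` helper are hypothetical names rather than the real ZisK data structures, and the operands are plain immediates. It only shows the c = op(a, b) shape and a lookup-by-pc loop.

```python
from dataclasses import dataclass
from typing import Callable, Dict

MASK64 = (1 << 64) - 1  # keep results within 64 bits


@dataclass(frozen=True)
class Inst:
    """Hypothetical instruction record: c = op(a, b), then continue at next_pc."""
    op: Callable[[int, int], int]
    a: int        # simplification: immediates only
    b: int
    next_pc: int


def run(rom: Dict[int, Inst], pc: int, end_pc: int) -> int:
    """Walk the ROM from pc until end_pc is reached; return the last c value."""
    c = 0
    while pc != end_pc:
        inst = rom[pc]  # a KeyError here would mean pc landed on an empty slot
        c = inst.op(inst.a, inst.b) & MASK64
        pc = inst.next_pc
    return c


# Two instructions at 32-bit-aligned addresses; 0x1004 stands in for the exit address.
rom = {
    0x80000000: Inst(op=lambda a, b: a + b, a=2, b=3, next_pc=0x80000004),
    0x80000004: Inst(op=lambda a, b: a * b, a=6, b=7, next_pc=0x1004),
}
result = run(rom, pc=0x80000000, end_pc=0x1004)
```

Because the map is rebuilt from the ELF on every run, nothing in this picture needs to be persisted between emulations.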

Alignment

The translation between the two ISAs is mostly 1:1 (one RISC-V instruction becomes one ZisK instruction), but a handful of RISC-V instructions don't have a single-instruction ZisK equivalent. The alignment scheme is designed to leave room for those expansions:

ISA	Instruction width	Alignment
RV64IMA	32 bits	32-bit
RV64IMAC (with the C extension)	16 or 32 bits	16-bit
ZisK	variable	8-bit

ZisK's 8-bit alignment is the key choice. It means every RISC-V instruction's address — 32-bit aligned, so a multiple of 4 — has room for four ZisK instructions before colliding with the next one. That headroom is exactly what the awkward cases need:

RISC-V instruction class	How it transpiles
Most RV64IMA instructions	One ZisK instruction at the same address. The byte slots in between stay empty.
Atomic instructions	2–4 ZisK instructions packed into the slots starting at the RISC-V instruction's address.
C-extension instructions	Usually one ZisK instruction; sometimes two when the 16-bit encoding implies more work than a single ZisK op can express.
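The headroom argument is plain arithmetic, sketched below. The function names are made up for illustration; the only inputs taken from the tables are the 8-bit ZisK alignment and the 16- and 32-bit RISC-V instruction widths.

```python
def slots_available(riscv_width_bits: int, zisk_align_bits: int = 8) -> int:
    """ZisK address slots covered by one RISC-V instruction of the given width."""
    if riscv_width_bits % zisk_align_bits:
        raise ValueError("RISC-V width must be a multiple of the ZisK alignment")
    return riscv_width_bits // zisk_align_bits


def fits(expansion_len: int, riscv_width_bits: int) -> bool:
    """True if an expansion of expansion_len ZisK instructions fits before the next RISC-V address."""
    return 1 <= expansion_len <= slots_available(riscv_width_bits)


slots_32 = slots_available(32)  # a 32-bit instruction spans 4 byte addresses
slots_16 = slots_available(16)  # a 16-bit compressed instruction spans 2
```

This matches the table: an atomic instruction (32 bits wide) can expand to as many as four ZisK instructions, while a compressed instruction has room for at most two.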

The same scheme also keeps control flow simple: ZisK addresses match RISC-V addresses everywhere a base RISC-V instruction lives, so RISC-V branch targets land on real ZisK instructions without any remapping table.

Example

Take a snippet of RISC-V code starting at address 0, with one instruction of each kind for illustration:

The transpiler walks each RISC-V instruction and emits one or more ZisK instructions at the same address, leaving the in-between slots empty:

ZisK instructions don't actually have a fixed bit length — the diagram represents them as 8-bit because that's the alignment granularity. The "no code" gaps are exactly the slack that absorbs the variable-width RISC-V layout.
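The layout rule can also be sketched in a few lines of code. Everything specific here is invented for illustration: the `layout` helper, the placeholder names such as `zisk_1a`, and the expansion counts are assumptions, not the transpiler's actual output. The sketch only shows the placement rule: emit at the RISC-V address, pack any extra instructions into the following byte slots, then skip ahead by the RISC-V instruction's width.

```python
from typing import Dict, List, Tuple

# (RISC-V mnemonic, encoded width in bytes, placeholder ZisK expansion)
Program = List[Tuple[str, int, List[str]]]


def layout(program: Program, base: int = 0) -> Dict[int, str]:
    """Place each expansion at its RISC-V address; unused slots get no entry."""
    rom: Dict[int, str] = {}
    addr = base
    for mnemonic, width, expansion in program:
        if not 1 <= len(expansion) <= width:
            raise ValueError(f"{mnemonic}: {len(expansion)} ops do not fit in {width} slots")
        for offset, zisk_op in enumerate(expansion):
            rom[addr + offset] = zisk_op
        addr += width  # the next RISC-V instruction starts here
    return rom


program: Program = [
    ("add",      4, ["zisk_0"]),                        # plain 1:1 case
    ("amoadd.d", 4, ["zisk_1a", "zisk_1b", "zisk_1c"]),  # atomic, expanded
    ("c.addi",   2, ["zisk_2"]),                        # compressed, 1:1
    ("c.jr",     2, ["zisk_3a", "zisk_3b"]),            # compressed, expanded (hypothetical)
    ("sub",      4, ["zisk_4"]),
]
rom = layout(program)
riscv_addrs = [0, 4, 8, 10, 12]                       # where each RISC-V instruction starts
empty = [a for a in range(16) if a not in rom]        # the "no code" slots
```

Every address in `riscv_addrs` is a key in `rom`, which is why a branch to any RISC-V instruction address needs no translation.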

The address map

The transpiled ROM doesn't live alone in the address space. The ZisK runtime (ziskos) reserves a few regions for its own plumbing — entry/exit handlers, input data, RAM. Knowing where each region sits is useful when reading a guest program's disassembly or chasing down a memory issue.

High level

From	To	Region
`0x1000`	`0x10000000`	`ziskos` ROM
`0x40000000`	`0x7FFFFFFF`	Input data
`0x80000000`	`0x88000000`	Program ROM
`0xA0000000`	`0xBFFFFFFF`	RAM

The four regions split cleanly by purpose: ziskos ROM is runtime support, the input region is where the host's input bytes are placed before execution begins, the program ROM is where the transpiled guest lives, and RAM is read/write working memory.
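When chasing a stray address, a small lookup over this table can be handy. The sketch below copies the bounds verbatim from the table; `region_of` is a hypothetical helper, and treating the "To" column as an inclusive upper bound is an assumption, since the table does not say.

```python
from typing import Optional

# (from, to, region) as listed in the table; "to" is treated as inclusive (assumption).
REGIONS = [
    (0x0000_1000, 0x1000_0000, "ziskos ROM"),
    (0x4000_0000, 0x7FFF_FFFF, "Input data"),
    (0x8000_0000, 0x8800_0000, "Program ROM"),
    (0xA000_0000, 0xBFFF_FFFF, "RAM"),
]


def region_of(addr: int) -> Optional[str]:
    """Return the name of the region containing addr, or None if it falls in a gap."""
    for lo, hi, name in REGIONS:
        if lo <= addr <= hi:
            return name
    return None


print(region_of(0x8000_0010))  # an address early in the guest code
print(region_of(0x2000_0000))  # falls between ziskos ROM and input data
```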

Inside the ROM region

Address	Symbol	What's there
`0x1000`	`ROM_ENTRY`	Where execution starts. Jumps to the program launcher.
`0x1004`	`ROM_EXIT`	Where execution ends. Returning here halts the program.
`0x1008`	`FLOAT_HANDLER_ADDR`	Soft-float trampoline. Saves the RISC-V register file, calls into the float library, restores the register file. Used whenever an `F` or `D` instruction is transpiled.
`0x1110`	Program launcher	Writes initial values for ROM and RAM globals, then jumps to the first guest instruction at `0x80000000`. Also hosts the syscall trap handler, including the exit sequence that commits public outputs and jumps to `ROM_EXIT`.
`0x80000000`	Program code	The transpiled guest. Execution ends with a syscall that returns control to the launcher's exit path.
`0x87F00000`	Float soft-library	The code the soft-float trampoline jumps into.

Inside the RAM region

Address	Symbol	What's there
`0xA0000000`	`RAM_ADDR` / `SYS_ADDR`	`ziskos` system RAM.
`0xA0010000`	`OUTPUT_ADDR`	The buffer the guest writes its public outputs into. After proving, the host reads from here.
`0xA0030000`	Program RAM	The guest's general-purpose heap and stack.
`0xBFFF0000`	Float soft-library RAM	Scratch space the soft-float code uses.
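For reading disassembly or a faulting address, it can help to express an address as "nearest landmark plus offset". The sketch below uses only the addresses and names from the two tables above; the `describe` helper itself is hypothetical. It does not know where a landmark's range ends, so an address in an unmapped gap is still attributed to the landmark below it.

```python
import bisect
from typing import Optional

# Landmarks copied from the ROM and RAM tables, sorted by address.
LANDMARKS = sorted([
    (0x1000, "ROM_ENTRY"),
    (0x1004, "ROM_EXIT"),
    (0x1008, "FLOAT_HANDLER_ADDR"),
    (0x1110, "Program launcher"),
    (0x80000000, "Program code"),
    (0x87F00000, "Float soft-library"),
    (0xA0000000, "RAM_ADDR / SYS_ADDR"),
    (0xA0010000, "OUTPUT_ADDR"),
    (0xA0030000, "Program RAM"),
    (0xBFFF0000, "Float soft-library RAM"),
])
_STARTS = [start for start, _ in LANDMARKS]


def describe(addr: int) -> Optional[str]:
    """Format addr as '<landmark>+0x<offset>' using the closest landmark at or below it."""
    i = bisect.bisect_right(_STARTS, addr) - 1
    if i < 0:
        return None  # below the first landmark
    start, name = LANDMARKS[i]
    return f"{name}+0x{addr - start:x}"


print(describe(0xA0010008))  # an address 8 bytes into the output buffer
```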

Where this picks up

You now know what runs (the ISA) and what it runs on (the processor and the transpiled ROM). The next page, Arithmetization, opens up the layer beneath: how an execution becomes a system of polynomial constraints the prover can prove.
