Transpilation
This page explains how the transpiler bridges a standard RISC-V ELF and the ZisK ISA. It covers why transpilation runs at the start of every emulation rather than being cached, how ZisK's 8-bit instruction alignment leaves room to expand awkward RISC-V instructions, a worked layout example, and the address map the transpiled ROM, input data, and RAM live in at runtime.
The ISA & processor page described the shape the ZisK
processor expects every instruction in: c = op(a, b) with four
specific-purpose registers. But the binary you compile with
cargo-zisk build is still a standard RISC-V ELF — RV64IMA, the
hardware-style ISA. Something has to bridge those two worlds, and
that something is the transpiler.
Why transpile at runtime?
Transpilation runs at the start of every emulation, not once and cached. That sounds expensive, but in practice it's no slower than reading a previously transpiled ZisK binary from disk:
| Approach | What it costs |
|---|---|
| Cache the transpiled ZisK ROM | Disk I/O proportional to ROM size. Plus a versioning/invalidation story every time the transpiler changes. |
| Re-transpile from the RISC-V ELF | CPU time proportional to ELF size. Cheap enough to be unnoticeable. No cache invalidation, no stale-output footguns. |
Re-transpiling is the path of least friction: the RISC-V ELF is already the canonical artifact (the verification key is anchored to its transpiled form), so making the in-memory ZisK ROM a deterministic function of it removes a whole category of "which file is the real one?" questions.
The output of the transpiler is a ZisK ROM: a map from program counter to ZisK instruction, ready for the emulation loop in ISA & processor → How an instruction is emulated to walk through.
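
As a rough mental model, the ROM can be pictured as a dictionary keyed by program counter. The sketch below is illustrative only: `ZiskInst`, its single `op` field, and `fetch` are hypothetical names chosen for this example, not the transpiler's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZiskInst:
    """Hypothetical stand-in for one ZisK instruction of the form c = op(a, b)."""
    op: str  # operation name, e.g. "add" (illustrative only)


# The ROM maps a program counter to the instruction stored at that address.
Rom = dict[int, ZiskInst]


def fetch(rom: Rom, pc: int) -> ZiskInst:
    """Return the instruction at pc, or fail if that address holds no code."""
    try:
        return rom[pc]
    except KeyError:
        raise LookupError(f"no ZisK instruction at pc={pc:#x}") from None


# Two made-up instructions at consecutive RISC-V addresses.
rom: Rom = {0x80000000: ZiskInst("add"), 0x80000004: ZiskInst("sub")}
```

A lookup such as `fetch(rom, 0x80000004)` returns the stored instruction, while an address with no entry raises `LookupError`, which models an attempt to execute an empty slot.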
Alignment
The translation between the two ISAs is mostly 1:1 — one RISC-V instruction becomes one ZisK instruction — but a handful of RISC-V instructions don't have a single-instruction ZisK equivalent, and the alignment scheme is designed to absorb the slack:
| ISA | Instruction width | Alignment |
|---|---|---|
| RV64IMA | 32 bits | 32-bit |
| RV64IMAC (with the C extension) | 16 or 32 bits | 16-bit |
| ZisK | variable | 8-bit |
ZisK's 8-bit alignment is the key choice. A base RISC-V instruction sits at a 32-bit-aligned address (a multiple of 4), so it has room for four ZisK instructions before colliding with the next one; a 16-bit compressed instruction has room for two. That headroom is exactly what the awkward cases need:
| RISC-V instruction class | How it transpiles |
|---|---|
| Most RV64IMA instructions | One ZisK instruction at the same address. The byte slots in between stay empty. |
| Atomic instructions | 2–4 ZisK instructions packed into the slots starting at the RISC-V instruction's address. |
| C-extension instructions | Usually one ZisK instruction; sometimes two when the 16-bit encoding implies more work than a single ZisK op can express. |
The same scheme also keeps control flow simple: ZisK addresses match RISC-V addresses everywhere a base RISC-V instruction lives, so RISC-V branch targets land on real ZisK instructions without any remapping table.
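
The slot rule can be sketched in a few lines. This is a minimal illustration, assuming one ZisK instruction per byte address; `zisk_addresses` is a hypothetical helper, not part of the transpiler.

```python
def zisk_addresses(riscv_addr: int, riscv_width: int, n_zisk: int) -> list[int]:
    """Addresses of the ZisK instructions emitted for one RISC-V instruction.

    riscv_width is the RISC-V encoding size in bytes (4, or 2 with the C
    extension). With 8-bit ZisK alignment, each byte address in that span
    is one available slot.
    """
    if not 1 <= n_zisk <= riscv_width:
        raise ValueError("expansion does not fit in the instruction's byte span")
    return [riscv_addr + i for i in range(n_zisk)]


# A hypothetical atomic instruction at 0x80000010 that expands to three
# ZisK instructions. The first slot keeps the RISC-V address.
atomic = zisk_addresses(0x80000010, 4, 3)
```

Because the first emitted instruction always sits at the original RISC-V address, a branch to that address reaches the start of the expansion unchanged.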
Example
Take a snippet of RISC-V code starting at address 0, with one
instruction of each kind for illustration:
The transpiler walks each RISC-V instruction and emits one or more ZisK instructions at the same address, leaving the in-between slots empty:
ZisK instructions don't actually have a fixed bit length — the diagram represents them as 8-bit because that's the alignment granularity. The "no code" gaps are exactly the slack that absorbs the variable-width RISC-V layout.
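
A layout of this kind can be reproduced with a short script. The instruction kinds and expansion counts below are made up for illustration and do not correspond to specific RISC-V or ZisK opcodes; the only rule taken from the text is that each expansion starts at the RISC-V address and must fit in that instruction's byte span.

```python
# (kind, RISC-V width in bytes, number of ZisK instructions emitted).
# Kinds and expansion counts are hypothetical.
program = [
    ("base", 4, 1),
    ("atomic", 4, 3),
    ("compressed", 2, 1),
    ("compressed", 2, 2),
]


def layout(program, start=0):
    """Map every byte address in the program to the kind of code there, or None."""
    slots = {}
    addr = start
    for kind, width, n_zisk in program:
        assert n_zisk <= width, "expansion must fit in the instruction's span"
        for i in range(width):
            # The first n_zisk slots hold code; the rest stay empty.
            slots[addr + i] = kind if i < n_zisk else None
        addr += width
    return slots


slots = layout(program)
for addr, kind in slots.items():
    print(f"{addr:#04x}  {kind or 'no code'}")
```

In this sketch the twelve byte addresses hold seven ZisK instructions, and the remaining five slots are the empty gaps.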
The address map
The transpiled ROM doesn't live alone in the address space. The
ZisK runtime (ziskos) reserves a few regions for its own
plumbing — entry/exit handlers, input data, RAM. Knowing where
each region sits is useful when reading a guest program's
disassembly or chasing down a memory issue.
High level
| From | To | Region |
|---|---|---|
| 0x1000 | 0x10000000 | ziskos ROM |
| 0x40000000 | 0x7FFFFFFF | Input data |
| 0x80000000 | 0x88000000 | Program ROM |
| 0xA0000000 | 0xBFFFFFFF | RAM |
The four regions split cleanly by purpose: ziskos ROM is
runtime support, the input region is where the host's input
bytes are placed before execution begins, the program ROM is
where the transpiled guest lives, and RAM is read/write working
memory.
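
When reading a disassembly, it can help to classify an address by region. The sketch below copies the bounds from the table; `region_of` is a hypothetical helper, and treating the "To" column as an inclusive upper bound is an assumption.

```python
from typing import Optional

# (first address, last address, name), copied from the high-level table.
# Assumption: the "To" column is an inclusive upper bound.
REGIONS = [
    (0x00001000, 0x10000000, "ziskos ROM"),
    (0x40000000, 0x7FFFFFFF, "Input data"),
    (0x80000000, 0x88000000, "Program ROM"),
    (0xA0000000, 0xBFFFFFFF, "RAM"),
]


def region_of(addr: int) -> Optional[str]:
    """Return the name of the region containing addr, or None if unmapped."""
    for first, last, name in REGIONS:
        if first <= addr <= last:
            return name
    return None
```

For example, `region_of(0x80000000)` yields `"Program ROM"`, while an address in the gap between the program ROM and RAM yields `None`.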
Inside the ROM regions
| Address | Symbol | What's there |
|---|---|---|
| 0x1000 | ROM_ENTRY | Where execution starts. Jumps to the program launcher. |
| 0x1004 | ROM_EXIT | Where execution ends. Returning here halts the program. |
| 0x1008 | FLOAT_HANDLER_ADDR | Soft-float trampoline. Saves the RISC-V register file, calls into the float library, restores the register file. Used whenever an F or D instruction is transpiled. |
| 0x1110 | Program launcher | Writes initial values for ROM and RAM globals, then jumps to the first guest instruction at 0x80000000. Also hosts the syscall trap handler, including the exit sequence that commits public outputs and jumps to ROM_EXIT. |
| 0x80000000 | Program code | The transpiled guest. Execution ends with a syscall that returns control to the launcher's exit path. |
| 0x87F00000 | Float soft-library | The code the soft-float trampoline jumps into. |
Inside the RAM region
| Address | Symbol | What's there |
|---|---|---|
| 0xA0000000 | RAM_ADDR / SYS_ADDR | ziskos system RAM. |
| 0xA0010000 | OUTPUT_ADDR | The buffer the guest writes its public outputs into. After proving, the host reads from here. |
| 0xA0030000 | Program RAM | The guest's general-purpose heap and stack. |
| 0xBFFF0000 | Float soft-library RAM | Scratch space the soft-float code uses. |
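
The table gives start addresses only. The sketch below derives a size for each sub-region under the assumption that each one extends to the start of the next, with the last ending at 0xBFFFFFFF from the high-level map. The helper name `subregion_sizes` and the short labels are hypothetical.

```python
# Start addresses from the RAM table.
RAM_SUBREGIONS = [
    (0xA0000000, "ziskos system RAM"),
    (0xA0010000, "output buffer"),
    (0xA0030000, "program RAM"),
    (0xBFFF0000, "float soft-library RAM"),
]
RAM_END = 0xBFFFFFFF  # last RAM address, from the high-level map


def subregion_sizes():
    """Size in bytes of each sub-region, assuming it runs to the next start."""
    starts = [start for start, _ in RAM_SUBREGIONS] + [RAM_END + 1]
    return {
        name: starts[i + 1] - starts[i]
        for i, (_, name) in enumerate(RAM_SUBREGIONS)
    }


for name, size in subregion_sizes().items():
    print(f"{name}: {size // 1024} KiB")
```

Under that assumption the system RAM and the float scratch area each come to 64 KiB, the output buffer to 128 KiB, and program RAM takes the rest of the 512 MiB region.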
Where this picks up
You now know what runs (the ISA) and what it runs on (the processor and the transpiled ROM). The next page, Arithmetization, opens up the layer beneath: how an execution becomes a system of polynomial constraints the prover can prove.