Calling WebAssembly from Rust
WebAssembly originated as a lightweight, portable, sandboxed code execution model for the web. Since its early days, people have realized that these qualities were useful outside of the web as well. As the standard has evolved, a wealth of tooling and standards for running WebAssembly ex-browser have appeared.
As I write this in August 2021, embedding WebAssembly involves understanding a handful of moving parts: language proposals at various stages of implementation; consensus emerging around standards like WASI; and maturing toolchains for various languages. In this post, I’ll attempt to give a “lay of the land” on the current state of WebAssembly, as it pertains to embedding it outside of the browser.
Context
This post stems from my notes on design decisions that I encountered when embedding a WebAssembly interpreter into a Rust program, so I’ll start with some context on my use case.
Since January, I have been developing a library for sharing state between Rust programs called Aper. One of the intended uses of Aper is synchronizing state between multiple browsers for real-time multiplayer games or collaboration apps.
So far, the preferred way to serve an Aper app has been to build a single server binary containing both the application logic and the server logic. As a result, deploying an update to application code requires re-deploying the whole server, and doing that on a running server without interrupting existing sessions is tricky.
So, I had the idea to separate the server code from the application logic. I was already compiling the client code to WebAssembly, making WebAssembly modules a natural unit of deployment for the back-end as well.
The result is Jamsocket, which can take an arbitrary WebAssembly module and provide a WebSocket interface to it (though there’s still some work to be done to satisfy my original motivating use case).
With that in mind, what follows are the design decisions I encountered while implementing Jamsocket.
Picking a runtime
WebAssembly is a standard, not an implementation, so you need to pick an implementation to embed. This is a two-stage decision: first, you need to decide whether you actually want to embed a WebAssembly runtime, or whether you want to embed a JavaScript runtime that supports WebAssembly. In either case, you then have a choice of libraries to build your runtime stack.
WASM, or JS+WASM?
There are certain advantages to embedding a JavaScript runtime. For one thing, you get support for JavaScript at no extra cost. Most languages that compile to WebAssembly are built around supporting a JavaScript host environment, so tooling is potentially better. For example, `wasm-bindgen` in Rust generates JavaScript that abstracts away the JavaScript/Rust boundary. Plus, many JavaScript runtimes include out-of-the-box APIs for things like making HTTP requests or threading, which are not available in a raw WebAssembly host environment unless you provide them.
The downside of embedding a JavaScript runtime is that it creates another layer of indirection between the host code and the code running in WebAssembly. Even if you write your host code in Rust and your embedded code in Rust, values passed across the boundary would still be converted in and out of JavaScript values. This isn’t just a performance hit, it also creates a larger surface area for bugs to crop up.
My use case was to embed Rust code, so I didn’t want to incur the complexity or overhead of introducing a layer of JavaScript in between, and chose not to use a JavaScript runtime. If I had, I probably would have gone with Deno, which was created by the original author of Node.js and happens to be written in Rust.
The runtime stack
The runtime is the code that turns the bytes on disk of a WebAssembly module into instructions for your CPU, runs them, and (optionally) provides an interface for them to interact with other parts of the system. These are actually separable concerns (to an extent), so I find it helpful to think of it as a “runtime stack” instead of a discrete “runtime”.
At the bottom of the stack is code generation. The main contender here is Cranelift, which is used by all of the main Rust runtimes (though Wasmer can also be used without it).
Next in the stack is the code that invokes the generated code and provides you with a way to make function calls across the sandbox boundary, both calling into the module and calling the host environment from within the module. The main contenders here are wasmtime and Wasmer (Lucet is another option, specialized for servers: it compiles modules ahead of time to run in a Linux environment). For Jamsocket, I went with wasmtime. I figured I’d be using Cranelift in any case, so wasmtime (which the Cranelift project is a part of) seemed like a good default choice.
Going back to my mental model of the runtime stack, the final (and optional) layer is the system interface. WebAssembly on its own is a self-contained, deterministic runtime: it has no way to make syscalls, so it cannot access system resources like files, sockets, or random entropy, or even read the system clock. One option is to explicitly create and expose a bespoke interface that provides additional capabilities to the module. Another option is to link in a WASI implementation.
In my case, I chose a bit of both. I give modules a bespoke interface for making calls that only make sense in the context of a Jamsocket server, like sending WebSocket messages. For general things like accessing random entropy or the system time, I use WASI rather than reinventing the wheel.
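As a sketch of how those layers fit together, here is roughly what wiring up both approaches at once might look like with wasmtime and the wasmtime-wasi crate. This is a hedged sketch, not Jamsocket's actual code: the `"env"`/`send_message` import and the `service.wasm` path are invented for illustration, and exact method names and signatures have shifted between wasmtime releases.

```rust
use anyhow::Result;
use wasmtime::{Engine, Linker, Module, Store};
use wasmtime_wasi::{WasiCtx, WasiCtxBuilder};

fn main() -> Result<()> {
    let engine = Engine::default();
    // `service.wasm` is a placeholder for your compiled module.
    let module = Module::from_file(&engine, "service.wasm")?;

    let mut linker: Linker<WasiCtx> = Linker::new(&engine);

    // General system facilities (clock, entropy, ...) come from a
    // WASI implementation linked into the module's imports.
    wasmtime_wasi::add_to_linker(&mut linker, |ctx| ctx)?;

    // A bespoke host function the module can import. `send_message`
    // is a hypothetical Jamsocket-style call; a real implementation
    // would read `len` bytes at `ptr` from the module's linear
    // memory and forward them over a WebSocket.
    linker.func_wrap("env", "send_message", |client: u32, ptr: u32, len: u32| {
        println!("send_message({}, {}, {})", client, ptr, len);
    })?;

    let wasi = WasiCtxBuilder::new().inherit_stdio().build();
    let mut store = Store::new(&engine, wasi);
    let _instance = linker.instantiate(&mut store, &module)?;
    Ok(())
}
```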
Interfacing with the module
WebAssembly currently has five data types: four representing numbers, and one representing an opaque reference held by the host system. From the application programmer’s point of view, this limited palette is not a concern: the data types they program with are an abstraction created by languages and compilers, and are fairly agnostic to the underlying hardware (or virtual hardware) instruction set’s type system. But when we interface directly with a WebAssembly module, it becomes our concern, because those are the only types we can pass directly to functions exposed by the module.
On top of that, WebAssembly modules are sandboxed, so they can’t access memory outside a predefined block; indeed, this is a key selling point of WebAssembly. But it further complicates passing a value in to a WebAssembly function, because we can’t just hand the function a pointer to a value in the caller’s memory space.
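To make that concrete, here is a sketch of how a function that logically takes a string gets lowered at the boundary. The names here are invented for illustration; the point is that only core value types can cross, so a `&str` becomes a (pointer, length) pair into the module's linear memory.

```rust
// What a module author would like to export:
pub fn greet(name: &str) -> u32 {
    println!("Hello, {}!", name);
    name.len() as u32
}

// What can actually cross the wasm boundary: only core value types.
// A lowered export takes a pointer and a length into the module's
// linear memory instead of a `&str`.
#[no_mangle]
pub extern "C" fn greet_lowered(ptr: *const u8, len: u32) -> u32 {
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len as usize) };
    let name = std::str::from_utf8(bytes).expect("caller passed valid UTF-8");
    greet(name)
}
```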
Instead, the way to pass arbitrary-length data to WebAssembly is a bit clunky. First, the module has to expose functions of its internal memory allocator (i.e. `malloc` and `free`) to the host environment. When the host environment wants to pass a value to a function, it asks the module to allocate the appropriate amount of space in its linear memory and return a pointer to it. Then, the host copies the value into the newly-allocated location in the module’s own memory. Now the host has a pointer to the value that the module can access, and it calls the function with that pointer. Finally, after the call returns, it’s the host’s responsibility to free that memory by calling the `free` function exposed by the module’s allocator.
As a rough sketch of the sequence of operations, passing arbitrary-length data to a WebAssembly function might look like this (I’ve omitted the implementation of the `self.*` functions, which will depend on which wasm runtime you’re using):
```rust
fn call_wasm_function(&mut self, data: &[u8]) {
    let len = data.len() as u32;

    // Allocate enough space for `data` in the wasm
    // module's linear memory. The implementation of
    // `self.malloc` (not shown) will call a `malloc`
    // function exposed by the module.
    let ptr = self.malloc(len);

    // Copy `data` into the module's linear memory at the
    // provided location. We do this directly with our
    // reference to the module instance's linear memory
    // object; this is the only step that does not involve
    // calling any functions exported by the module.
    self.write_memory(ptr, data);

    // This is where we call the actual function. The
    // underlying implementation would call a function
    // that the wasm module exposes, which will then read
    // the data from the appropriate location in its
    // memory.
    self.call_guest_function(ptr, len);

    // After the call has completed, we are done and can
    // free the memory. This calls a `free` function
    // exposed by the module.
    self.free(ptr, len);
}
```
(By the way, Rust’s `wasm-bindgen` actually takes a slightly different approach, keeping the allocated memory around for future calls and growing it if necessary. But the essential idea is the same.)
There are a few corollaries to this. The first is that the module is expected to have a memory allocator to expose. WebAssembly lets modules manage their own memory, so a module may not even have an allocator, but it needs to at least be able to fake one for this approach to work. Another corollary is that the module needs to expose its allocator in a way that our host interface understands. The easiest way to achieve this is to provide the user with code that automatically exposes an interface to an allocator (e.g. through a compile-time macro).
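On the guest side, exposing an allocator can be as simple as wrapping Rust's global allocator in a pair of exported functions. This is a hedged sketch of what such generated code might look like; the export names `jam_malloc` and `jam_free` are invented, and real projects use their own names and conventions.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Hypothetical guest-side exports that give the host a way to
// allocate and free space inside the module's linear memory.
#[no_mangle]
pub extern "C" fn jam_malloc(len: u32) -> *mut u8 {
    if len == 0 {
        // Zero-sized allocations are invalid; hand back a dangling,
        // well-aligned pointer instead.
        return std::ptr::NonNull::<u8>::dangling().as_ptr();
    }
    let layout = Layout::from_size_align(len as usize, 1).unwrap();
    unsafe { alloc(layout) }
}

#[no_mangle]
pub extern "C" fn jam_free(ptr: *mut u8, len: u32) {
    if len == 0 {
        return;
    }
    // The host passes the length back so we can reconstruct the
    // layout; `dealloc` requires the same layout used to allocate.
    let layout = Layout::from_size_align(len as usize, 1).unwrap();
    unsafe { dealloc(ptr, layout) }
}
```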
Unfortunately, this approach nullifies WebAssembly’s language-agnostic design, because it means you need to write a guest-side interface for any language you want to make embeddable. There are solutions to this in the works, including a procedure call framework (waPC) and the interface types proposal.
For now, I chose to write a Rust macro that generates a WebAssembly interface for a specific trait that module authors can implement, and also takes care of exposing the allocator methods. That way, module authors are entirely abstracted away from the limited type system at the WebAssembly/host boundary.
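I won't reproduce the real macro here, but the shape of what such a macro generates looks roughly like the following sketch. The trait, function names, and single-global-instance pattern are simplified for illustration (and the allocator exports from above are omitted); since wasm modules are single-threaded, a thread-local instance suffices.

```rust
use std::cell::RefCell;

// A simplified stand-in for the trait that module authors implement.
pub trait Service {
    fn message(&mut self, data: &[u8]);
}

// An example implementation a module author might write.
pub struct EchoService {
    received: Vec<u8>,
}

impl Service for EchoService {
    fn message(&mut self, data: &[u8]) {
        self.received.extend_from_slice(data);
    }
}

// Macro-generated glue: one instance that the exported shims
// dispatch to.
thread_local! {
    static INSTANCE: RefCell<Option<EchoService>> = RefCell::new(None);
}

#[no_mangle]
pub extern "C" fn service_init() {
    INSTANCE.with(|i| *i.borrow_mut() = Some(EchoService { received: Vec::new() }));
}

// The host calls this with a (pointer, length) pair after copying
// the message into the module's linear memory.
#[no_mangle]
pub extern "C" fn service_message(ptr: *const u8, len: u32) {
    let data = unsafe { std::slice::from_raw_parts(ptr, len as usize) };
    INSTANCE.with(|i| {
        i.borrow_mut()
            .as_mut()
            .expect("service_init not called")
            .message(data)
    });
}

// A small accessor so the effect is observable from the host side.
#[no_mangle]
pub extern "C" fn service_received_len() -> u32 {
    INSTANCE.with(|i| i.borrow().as_ref().map_or(0, |s| s.received.len() as u32))
}
```

The module author only ever sees the `Service`-style trait; the pointer arithmetic and the limited boundary type system stay hidden in the generated shims.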
What’s next?
I’ve touched on a number of improvements on the horizon that will simplify the process of embedding a WebAssembly runtime. In particular, I expect that interface types will streamline a lot of the difficulties that currently arise from passing data across the boundary. Additionally, a proposed garbage collection extension should make life easier when dealing with garbage-collected languages, which may not expose `malloc`-level memory management.
Another area I’m eagerly watching is actor-model runtimes on top of WebAssembly. There’s a diverse set of projects building towards variations of this vision, including Lunatic, wasmCloud, Lumen, and Atmo.
One day, I’d like to be able to write an entire application as a set of actors such that the client and server both act as nodes in a distributed system, abstracting away the client/server boundary entirely. Jamsocket is my attempt at making one small step towards that world.