Moving data from Rust to JS

April 02, 2024

Suppose you have some structured data in your Rust WebAssembly program that you want to expose to JavaScript. There are different ways to do it with different tradeoffs, and in this post I dive a bit into them and sketch out a new better way.

For a concrete example, in retrowin32's debugger there's a disassembly view (click 'step' once in that UI to see) that renders some assembly using React. That is not just a glob of plain text, but rather structured data where e.g. the addresses within the instruction stream are marked and hyperlinked (try clicking an address in the disassembly), to either step the program to an instruction or view a memory dump at an address.

Recall (perhaps from my notes) that memory within the WebAssembly program is a big Uint8Array from JavaScript's perspective. At the core all approaches involve making the JS read that memory, just with different patterns.
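To make "JS reads that memory" concrete, here is a minimal sketch in plain Rust that models the boundary from the JS point of view. The `LinearMemory` type and its methods are hypothetical stand-ins of my own (not part of wasm-bindgen or any real API): the byte vector plays the role of the Uint8Array, and reading a string means slicing at a (pointer, length) pair and decoding into a fresh owned string.

```rust
/// Hypothetical stand-in for the Wasm linear memory that JS sees as a Uint8Array.
struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    /// Rust side: place `s` in memory and return the (pointer, length) pair
    /// that would be handed across the boundary.
    fn write_str(&mut self, s: &str) -> (usize, usize) {
        let ptr = self.bytes.len();
        self.bytes.extend_from_slice(s.as_bytes());
        (ptr, s.len())
    }

    /// JS side, conceptually: slice the bytes and decode them as UTF-8 into a
    /// new owned string. Returns None if the range is out of bounds or the
    /// bytes are not valid UTF-8.
    fn read_string(&self, ptr: usize, len: usize) -> Option<String> {
        let end = ptr.checked_add(len)?;
        let slice = self.bytes.get(ptr..end)?;
        // `to_vec` is the copy: the result does not borrow from `self.bytes`.
        String::from_utf8(slice.to_vec()).ok()
    }
}
```

Every approach below is some pattern of these reads; the differences are in how many of them there are and who drives them.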

The first option is to keep the structure in Rust and read out each field via individual JS calls. This section of the wasm-bindgen docs has an example of what it looks like. This is fairly verbose to set up (you need to write out a getter for each field), fairly chatty across the JS-Wasm boundary, and not great from a React perspective (because React prefers plain data objects — consider how a complex nested structure in particular works with this approach) but can be appropriate in contexts where it's a complex object on the Rust side.

If not that, then the other approaches are variations on copying the data to JS. Copying feels bad but makes sense in particular when the Rust code is generating data only for the JS side to use anyway; e.g. in the above disassembly, there's no need for the Rust side to hang on to any of it. And no matter what, any string that traverses the boundary from WebAssembly to JS must be copied, as JS strings own their data and cannot be views into Wasm memory.

The simplest approach for copying is to just serialize the whole thing as JSON. Concretely this means the Rust side generates a big blob of JSON, then JS copies that blob to a JS string, then JSON.parse()s it. This is less bad than it sounds in that it at least processes all the data in bulk; e.g. you only have to run one block of bytes through the Wasm→JS string decoder. And browsers have poured effort into fast JSON parsing. But it's also worse than it sounds in that it means you must generate JSON on the Rust side. If you're not doing that already, it means pulling in a bunch of serialization code. At some level it just feels wrong to serialize structured data into a string only to immediately parse that string.
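As a rough illustration of what "generate JSON on the Rust side" costs when you don't already have a serializer, here is a hand-rolled sketch using only the standard library. The `Person` struct and the function names are made up for this example; in practice you would more likely reach for a crate such as serde_json, which is exactly the extra serialization code mentioned above.

```rust
/// Example type for this sketch only.
struct Person {
    name: String,
    age: u32,
}

/// Append `s` to `out` as a JSON string literal, escaping quotes,
/// backslashes, and control characters.
fn write_json_string(out: &mut String, s: &str) {
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters need a \uXXXX escape.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
}

/// Build the whole object as one string, to be copied to JS in one go
/// and handed to JSON.parse() there.
fn person_to_json(p: &Person) -> String {
    let mut out = String::new();
    out.push_str("{\"name\":");
    write_json_string(&mut out, &p.name);
    out.push_str(",\"age\":");
    out.push_str(&p.age.to_string());
    out.push('}');
    out
}
```

Even for two fields there is a fair amount of escaping logic, and all of it exists only to produce text that the other side immediately parses back into structure.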

Alternatively you could have the Rust side construct the JS-side object by making calls across the Wasm boundary. In some sense this is the dual of the first approach: lots of Rust→JS calls instead of lots of JS→Rust calls. Imagine a Person struct:

struct Person {
    name: String,
    age: u32,
}

Serialization for this makes a series of calls like the following, where my made-up js namespace is just to show all the places where the Rust code is calling up to JS:

let obj = js::create_object();
let name = js::create_string("name");
let name_val = js::create_string(&person.name);
js::set_property(obj, name, name_val);
let age = js::create_string("age");
let age_val = js::create_number(person.age);
js::set_property(obj, age, age_val);

This is roughly the approach taken in serde-wasm-bindgen, which is the approach recommended in the Rust Wasm docs. It's conveniently all code generated via serde and in terms of the source-level modifications it only requires a few annotations on some structs.

But looking at the above you might have a few questions. One is why you need to make a call to JS to create a number object, given that Wasm supports numbers natively. I might be misreading the relevant code here, but I think it's because of the way the serialization code is structured: it effectively needs to recursively serialize each struct field to the same type, something like "a handle to a JS object".

The second is that it feels redundant to allocate strings for the names of each field (the quoted "name" above), especially if you're serializing a lot of these objects. serde-wasm-bindgen addresses this by instead interning the names of struct fields into a HashMap, which is nice, but it is also one of my least-favorite code patterns: the global cache of data that never shrinks. (It's at least bounded by the total number of distinct field names you ever serialize.)
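The interning pattern looks roughly like the sketch below. This is my own simplified model, not serde-wasm-bindgen's actual code: `JsHandle` and `FieldNameCache` are hypothetical names, and the counter stands in for the Rust→JS call that would really allocate the JS string.

```rust
use std::collections::HashMap;

/// Hypothetical handle to a JS string; a real binding would reference a JS value.
#[derive(Clone, Copy, Debug, PartialEq)]
struct JsHandle(u32);

/// Cache from field name to the JS string already created for it.
/// Entries are never removed, so it only ever grows.
struct FieldNameCache {
    handles: HashMap<&'static str, JsHandle>,
    /// Counts how many times we "called into JS" to allocate a string.
    js_allocations: u32,
}

impl FieldNameCache {
    fn new() -> Self {
        FieldNameCache { handles: HashMap::new(), js_allocations: 0 }
    }

    /// Return the cached handle for `name`, allocating on the JS side
    /// only the first time a given name is seen.
    fn intern(&mut self, name: &'static str) -> JsHandle {
        if let Some(&handle) = self.handles.get(name) {
            return handle;
        }
        let handle = JsHandle(self.js_allocations);
        self.js_allocations += 1;
        self.handles.insert(name, handle);
        handle
    }
}
```

Serializing a thousand Person values this way allocates the "name" and "age" keys once each rather than a thousand times, at the price of a map that lives as long as the program does.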

Finally, all of these calls from Rust to JS are not free. I don't have a good intuition for how fast WebAssembly→JS calls are (for all I know VM engines are able to make them equivalent to within-language calls), but there is still setup goop on the Rust side and additional JS functions on the JS side just to glue these two sides together. Just getting a handle to a JS object requires bookkeeping on the Rust side.

With all this in mind, I sketched out a slightly different approach. To serialize the above Person struct, I codegen a JS function like:

function __waser_Person(name, age) { return {name, age}; }

The Rust-side generated code then looks like:

let name_val = js::create_string(&person.name);
let age = person.age;
let obj = js::__waser_Person(name_val, age);  // age passed as int

The idea here is that the JS engine is probably best equipped to do all of the relevant caching of the names of the fields, and perhaps we can better hit some optimizations around object construction. It does mean we generate a JS function per type of structure serialized but it's relatively small.
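To compare the two shapes side by side, here is a sketch that replaces the made-up js namespace with a mock that only counts boundary crossings. Everything in it (`MockJs`, `Handle`, the function names) is hypothetical scaffolding for illustration, and the per-field version is the naive one from earlier, without field-name interning.

```rust
/// Opaque stand-in for "a handle to a JS object".
#[derive(Clone, Copy)]
struct Handle(u32);

/// Mock of the made-up `js` namespace: each method counts as one Rust→JS call.
#[derive(Default)]
struct MockJs {
    calls: u32,
    next_handle: u32,
}

impl MockJs {
    fn call(&mut self) -> Handle {
        self.calls += 1;
        self.next_handle += 1;
        Handle(self.next_handle)
    }
    fn create_object(&mut self) -> Handle { self.call() }
    fn create_string(&mut self, _s: &str) -> Handle { self.call() }
    fn create_number(&mut self, _n: u32) -> Handle { self.call() }
    fn set_property(&mut self, _obj: Handle, _key: Handle, _val: Handle) {
        self.calls += 1;
    }
    /// Stands in for the generated __waser_Person(name, age) function.
    fn waser_person(&mut self, _name: Handle, _age: u32) -> Handle { self.call() }
}

struct Person {
    name: String,
    age: u32,
}

/// Property-by-property construction, as in the earlier listing.
fn serialize_per_field(js: &mut MockJs, p: &Person) -> Handle {
    let obj = js.create_object();
    let name = js.create_string("name");
    let name_val = js.create_string(&p.name);
    js.set_property(obj, name, name_val);
    let age = js.create_string("age");
    let age_val = js.create_number(p.age);
    js.set_property(obj, age, age_val);
    obj
}

/// One generated constructor per type; the age crosses as a plain integer.
fn serialize_via_constructor(js: &mut MockJs, p: &Person) -> Handle {
    let name_val = js.create_string(&p.name);
    js.waser_person(name_val, p.age)
}
```

For a single Person the per-field version makes seven calls and the constructor version makes two. The count says nothing about what each call actually costs in a real engine, so treat it as a way to see the shape of the difference rather than as a benchmark.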

In my hacky prototype, running this over some sample data was ~16% faster than serde-wasm-bindgen and also generated smaller code (which I won't quantify because the smaller code was also probably in part due to not using serde). The benefit seems to come primarily from the single function call, not the "passed as int" part, but that surely depends on the specific type of data being serialized, as mine was mostly strings. But also, the performance of this area is not really in any critical path for me. I just found it an interesting exploration, so I am unlikely to land it anywhere!