Re-rewriting JS Mate Poe in Rust/Wasm

Javascript is an objectively terrible foundation for a project of any complexity, but back in 2019 when I first sat down to write JS Mate Poe, it was the best, if only, tool for the job.

But since then, WebAssembly (Wasm), and the ecosystem built around it, has matured to the point where it is almost capable of doing everything a library like this requires.

I'd prefer it without the "almost", of course, but that would take forever[1], and so last month, I finally decided to re-rewrite my favorite pet project from scratch.

I'm extremely happy with the result, but because of that pesky "almost", it required a number of (hopefully temporary) workarounds and concessions, resulting in a library that is mostly Wasm, but slightly Javascript too.

As Wasm is often touted as a "magic bullet", I thought it might be interesting to examine the shortcomings I ran up against in more detail. But before we can dive into anything functional, we should talk about one glaring problem affecting the very usage of Wasm.

Glue is Sticky.

Every modern browser will happily run a Wasm application, but none of them can actually load one, at least not directly.

For example, this does not work:

<script type="application/wasm" src="js-mate-poe.wasm"></script>

Instead, Wasm applications must be loaded and initialized with convoluted Javascript bootstrap code called "glue".

<script>
    fetch('js-mate-poe.wasm').then((r) => {
        // Lots of initialization stuff happens here.
    });
</script>
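For a rough sense of what that initialization involves, here is a minimal sketch of hand-written glue. It is not JS Mate Poe's actual glue; the `loadWasm` name and the empty default import object are assumptions made for illustration.

```javascript
// Hypothetical, minimal glue: get the binary, compile it, and hand it
// whatever imports it asks for. Real-world glue does a great deal more.
const loadWasm = async function(source, imports = {}) {
	// `source` may be a URL string (fetched) or raw bytes (used as-is).
	const bytes = 'string' === typeof source ?
		await fetch(source).then((r) => r.arrayBuffer()) :
		source;

	// Compile and instantiate in one go.
	const { instance } = await WebAssembly.instantiate(bytes, imports);
	return instance.exports;
};
```

In a browser, something like `loadWasm('js-mate-poe.wasm')` would resolve to the module's exports. Every Javascript function the module wants to call has to be supplied through `imports`, which is where the real bulk comes from.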

The glue for JS Mate Poe comes out to a whopping 9 KiB after minification, so I won't subject you to it here, but suffice it to say, that kind of bloat is less than ideal.

Regardless of the size, however, having to ship two files instead of one makes it significantly more difficult for end users to actually install and use libraries like JS Mate Poe.

So I cheated.

I embedded the Wasm inside its glue as one gigantic base64-encoded string. This adds even more unnecessary bloat[2] to the compiled script, but at least simplifies its distribution.
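The embedding trick amounts to something like the following. This is only a sketch: the `WASM_BASE64` constant and `decodeBase64` helper are hypothetical names, and the payload shown is just the eight-byte header of an empty Wasm module rather than the real library.

```javascript
// Hypothetical stand-in for the embedded binary: the base64 form of an
// empty Wasm module (magic number plus version, eight bytes in total).
const WASM_BASE64 = 'AGFzbQEAAAA=';

// Turn a base64 string back into the raw bytes it represents.
const decodeBase64 = function(encoded) {
	const binary = atob(encoded);
	const bytes = new Uint8Array(binary.length);
	for (let i = 0; i < binary.length; i++) {
		bytes[i] = binary.charCodeAt(i);
	}
	return bytes;
};

// The glue can then instantiate straight from memory; no second file,
// and no fetch(), required.
const embeddedReady = WebAssembly.instantiate(decodeBase64(WASM_BASE64), {});
```

The cost is the decode step at startup, plus the extra size of the encoded text itself.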

Maybe some day it will be possible to embed the glue inside of the Wasm instead, but for now, it is what it is.

And, glue or not, I ran into a handful of performance- and memory-related issues that ultimately required Javascript anyway. One way or another, this library would have had to straddle the two worlds.

God DOM It!

Wasm binaries are heavily optimized at compile-time, and as such, the internal happenings of JS Mate Poe (keeping track of what each "mate" should be doing, where, and when) leave the old version in the dust.

But it also has to manifest its state visually 10-60 times per second, and working with the DOM in Wasm is, frankly, not that great.

Lacking any direct access to the browser's APIs, Wasm must instead leverage Javascript for any DOM-related business, meaning, at best, it will always be at least a little bit slower than Javascript in this area.

And because each and every Javascript method requires its own import declaration, well, it's easy to understand how JS Mate Poe's glue ballooned to 9 KiB. Haha.

To illustrate the point, let's think about what it might take just to keep a Wasm library apprised of the current screen size.

In Javascript, the task is relatively straightforward:

let width = 0;
let height = 0;

const setSize = function() {
	width = document.documentElement.clientWidth;
	height = document.documentElement.clientHeight;
};

setSize(); // Call it once to set the initial value.

// And call it again anytime the window resizes.
window.addEventListener('resize', setSize, { passive: true });

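Wasm can do the same thing, but would need to import (at least) every object, property, and method touched above: document, documentElement, clientWidth, clientHeight, window, and addEventListener.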

So yeah, it's a lot, but I digress.
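As a rough illustration of that import burden, here is what the Javascript side of such an arrangement might look like if written by hand. Everything in it is an assumption made for the example: the `env` namespace, the `client_width`, `client_height`, and `watch_resize` import names, and the exported `set_size` function are all hypothetical, not JS Mate Poe's actual bindings.

```javascript
// Build a (hypothetical) import object exposing each DOM call the Wasm
// side would need. `win` is passed in rather than assumed so the sketch
// can be exercised outside of a browser too.
const makeSizeImports = function(win) {
	// Filled in after instantiation so the listener can call back in.
	const state = { exports: null };

	const imports = {
		env: {
			// One import per property read...
			client_width: () => win.document.documentElement.clientWidth,
			client_height: () => win.document.documentElement.clientHeight,

			// ...and another to register the resize listener. The
			// listener itself stays in Javascript and simply calls a
			// (hypothetical) Wasm export.
			watch_resize: () => {
				win.addEventListener(
					'resize',
					() => state.exports.set_size(),
					{ passive: true },
				);
			},
		},
	};

	return { imports, state };
};
```

Three imports and one export, just to mirror a dozen lines of Javascript.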

The actual performance differences that result range anywhere from negligible to terrible, but even in the worst cases, they might not actually be noticeable depending on the context.

All of JS Mate Poe's one-time setup and breakdown operations — adding/removing a few elements and event listeners — are handled solely on the Wasm side. They're technically a little slower because of it, but only slightly, and I can't see any difference with my human eyeballs, so it's fine.

I did, however, encounter some intermittent jitteriness with the actual animations. And since JS Mate Poe's primary purpose is to indefinitely animate a small sheep on the screen, that was rather disappointing.

But it wasn't just — or even mostly — the API indirection that was to blame, but rather the overhead of getting data to that API in the first place.

Common Data Types.

Despite the name, WebAssembly isn't a web-only technology. It is designed to be as portable as possible, allowing it to interoperate with more or less anything, anywhere.

And because of this, its APIs favor lowest common denominators.

For example, all data entering or leaving the Wasm must be expressed as a 32-bit or 64-bit number, integer or float. Nothing but those four primitive types is supported!

Smaller types can be recast easily enough, but for complex structures like strings, the situation is more complicated. The data must first be deconstructed, copied wholesale over to a shared memory buffer, and sent across the boundary as raw pointers and lengths — i.e. integers — and then the other side has to do the same thing in reverse.
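Seen from the Javascript side, that string dance looks roughly like this. It's a sketch under simplifying assumptions: a standalone `WebAssembly.Memory` stands in for a real module's memory, the `writeString` and `readString` helpers are hypothetical, and the destination offset is picked by hand (a real module would have its allocator choose one).

```javascript
// A single page (64 KiB) of linear memory, standing in for the memory a
// real Wasm module would export.
const demoMemory = new WebAssembly.Memory({ initial: 1 });

// Copy a string into shared memory at `ptr`, returning its byte length.
// The (ptr, len) pair, two plain integers, is what crosses the boundary.
const writeString = function(mem, ptr, value) {
	const bytes = new TextEncoder().encode(value);
	new Uint8Array(mem.buffer, ptr, bytes.length).set(bytes);
	return bytes.length;
};

// The reverse: rebuild a string from a pointer and a length.
const readString = function(mem, ptr, len) {
	return new TextDecoder().decode(new Uint8Array(mem.buffer, ptr, len));
};

// Same text on both ends, but only after an encode, a copy, and a decode.
const demoLen = writeString(demoMemory, 16, '--x');
const roundTrip = readString(demoMemory, 16, demoLen);
```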

Needless to say, it is best to avoid such overhead whenever possible.

The DOM, however, uses strings for damn near everything. And that, in a nutshell, was my problem.

JS Mate Poe's animation magic is relatively simple, leveraging only a handful of CSS classes and custom properties. It keeps track of the tick-to-tick differences internally and selectively updates the corresponding DOM properties that need updating during requestAnimationFrame cycles.
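The diffing idea can be sketched as follows. The real bookkeeping lives on the Rust side; this Javascript version, along with the `diffProps` name and the shape of the state objects, is purely illustrative.

```javascript
// Compare the previous and next tick's custom properties, returning only
// the entries that actually changed (and so need to touch the DOM).
const diffProps = function(prev, next) {
	const changed = {};
	for (const [key, value] of Object.entries(next)) {
		if (prev[key] !== value) { changed[key] = value; }
	}
	return changed;
};

// Example: only `x` moved between ticks, so only `--x` needs rewriting.
const changedProps = diffProps({ x: 10, y: 20 }, { x: 12, y: 20 });
```

An unchanged tick produces an empty object, and so no DOM writes at all.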

For the most part, that was enough to ensure smooth playback, even with all of Wasm's overhead, but some ticks require more changes than others. At the extreme, the library might have to pass as many as a dozen strings across the boundary to animate a single "mate", and there are two of them.

And occasionally, the overhead was just too much.

As a first step, I tried setting className en masse rather than toggling the classes individually. That halved the number of strings being shared, but slowed down the browser's paint performance, so it wasn't actually helpful.

Next, I begrudgingly set up a custom Javascript import method to handle the actual setting of the CSS properties, and updated Rust to use that instead of trying to handle it directly:

const writeCssProperty = function(el, key, value) {
	el.style.setProperty(`--${key}`, `${value}px`);
};

That one-liner did the trick!

It's subtle, but by having Javascript handle the "--" key prefix and "px" value suffix, the key and value could be passed over the boundary as not-strings, char and i32 respectively. (My CSS properties have single-letter names, like --x, --y, etc.)

This had the added benefit of allowing me to remove a lot of complexity from the Rust code, as I no longer needed to manually stringify the pixel values (to avoid dynamic allocation[3]) ahead of time.

But having done that, I couldn't really justify keeping any of the class-setting code in Rust either, especially since className was the wrong way to go about it.

And so that, too, became a simple pass-through Javascript import method — with boolean arguments representing each class state — and let me further reduce the Rust-side complexity.
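Such a pass-through might look something like this. The class names and argument order are made up for the example; only the general shape (booleans in, classList updates out) follows from the description above.

```javascript
// Hypothetical pass-through import: each flag says whether the
// corresponding class should be present on the element.
const writeClasses = function(el, flipped, dragging, hidden) {
	// The second argument to toggle() forces the class on or off; the
	// double negation turns 0/1 integers into proper booleans.
	el.classList.toggle('flipped', !! flipped);
	el.classList.toggle('dragging', !! dragging);
	el.classList.toggle('hidden', !! hidden);
};
```

No strings cross the boundary here; the class names never leave Javascript.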

I wasn't thrilled about having to split the codebase, but as ugly workarounds go, these didn't actually look too bad.

Unfortunately, there was a little more shifting to be done…

A Leaky Ship!

Update 2023-05-11: The issues described in this section have since fixed themselves, allowing me to move the functionality back to Wasm's domain. It may still be worth a read, though, as Javascript does it better, so depending on the context, such workarounds may still be warranted.

It's easy to imagine how all this boundary-crossing and data-sharing might lead to memory leaks or other complications, but for the most part, that actually didn't happen!

Javascript has its own garbage collection process and, because its API is the one being used for all of Wasm's DOM-related operations, it works the way it always does.

Objects are kept in memory so long as an active reference to them exists — whether in JS or Wasm — and, if/when that changes, they're marked for potential cleanup.

For the most part, that is…

It took me a while to track down the cause, but the memory usage of the Wasm version of JS Mate Poe was about one megabyte higher than it should have been.

For some reason, the binary-to-Uint8Array-to-Blob-to-URL.createObjectURL chain (used to generate "URLs" for the embedded media assets) was leaving multiple, superfluous copies of the file data in memory.

As with the performance-related issues, this ultimately required shifting some code from Rust to Javascript, but with the workflow reversed.

Because Wasm couldn't be trusted with any of the URL creation/handling business, Rust was relegated to a storage locker of sorts. All it needed to do was store the raw file contents in static arrays, and export methods to share their raw pointers and lengths with Javascript:

/// # The raw PNG data.
static IMG: [u8; 26_789] = […];

#[wasm_bindgen]
/// # Image Data Pointer.
pub fn img_ptr() -> *const u8 { IMG.as_ptr() }

#[wasm_bindgen]
/// # Image Data Length.
pub fn img_len() -> u32 { 26_789 }

Javascript, for its part, basically wound up doing exactly what Rust had done previously:

let imgUrl = null;
export const poeInitImage = function(wasm) {
	imgUrl = URL.createObjectURL(
		new Blob(
			[new Uint8ClampedArray(
				wasm.memory.buffer,
				wasm.img_ptr(),
				wasm.img_len(),
			)],
			{ type: 'image/png' },
		)
	);
};

Of course, with the URLs now living in Javascript, it no longer made sense to set the <img> and <audio> element sources in Rust — that would require string sharing! — so Javascript took over those duties too.

So much for a unified codebase!

Mule Variations.

Pretty much every article about Wasm reaches some variation of the same conclusion:

Wasm for some things.
Javascript for other things.
Neither is perfect.

And despite my best, stubborn efforts, that's exactly what I found with the re-rewrite of JS Mate Poe. My concessions, though small and targeted, ultimately resulted in a Wasm/Javascript hybrid.

But that may not always be the case.

As Wasm continues to develop, and its browser support continues to improve, most or all of these issues should become moot. (Some of them already have!)

Given enough time, a Wasm-only Poe should be totally achievable.

At least, if something could be done about that damn glue…

Poe Everywhere?

By the way, if you enjoyed having Poe run around your screen while reading this article, the library is also available as a Firefox extension.

Constant companionship is just a click away!

---

1. Wasm is one of those projects that wants to solve ALL THE PROBLEMS, ALL AT ONCE, but with so many interests pulling it in different directions, its pace of development is tortuous at best. Nobody currently living will see its true potential realized.

2. Base64 encoding uses four bytes of text-friendly ASCII to represent three bytes of raw binary data, resulting in an overhead of exactly one-third (or a bit more if padding and/or line breaks are added).

3. Dynamic allocation requires special memory-handling logic that can dramatically increase the file size of an otherwise small binary.

Josh Stoik
5 April 2023