ROSE-8 in customasm
Last week a friend told me about hlorenzi’s customasm, a tool that can act as an assembler for arbitrary CPU architectures: you just define a mapping from instructions to encodings.
Hey, I made a CPU once! How hard would it be to make a customasm definition for ROSE-8? Turns out…not very! I played around with it for about two hours, and by the end of it I’d translated an entire ROSE-8 program to customasm, with most of the definition file looking basically the same as the text reference for the ISA encoding.
Okay, so programs written by hand in assembly aren’t that long. But still, how long did it take me to write the original assembler? Probably more than two hours! (The commits are spaced over a few days, at least.) With customasm I could get something binary-identical to the original program output, and meanwhile I got expression syntax, custom functions, and everything else for free.
I spent some more time yesterday to prettify the definition file and make it more convenient (and played around some more with bigger ROSE-8 programs). I considered committing some of those bigger programs, but they didn’t really show anything interesting—the core instructions were all the same, only the syntax had changed a little! In a mostly automated way, even! (read: regex find/replace to do 90% of the translation work). The trickiest bit was actually that I had previously relied on literals being emitted little-endian and customasm defaults to big-endian—easy enough to fix once I realized it.
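To make the endianness mix-up concrete, here is a small Python sketch (not customasm syntax). The 16-bit value and the helper names are made up for illustration and are not taken from the ROSE-8 definition.

```python
def emit_big_endian(value: int, width_bytes: int) -> bytes:
    """Most significant byte first (the order the post says customasm defaults to)."""
    return value.to_bytes(width_bytes, "big")


def emit_little_endian(value: int, width_bytes: int) -> bytes:
    """Least significant byte first (the order the original programs relied on)."""
    return value.to_bytes(width_bytes, "little")


# Hypothetical 16-bit literal, chosen only to make the byte order visible.
literal = 0x1234

print(emit_big_endian(literal, 2).hex())     # 1234
print(emit_little_endian(literal, 2).hex())  # 3412
```

Single-byte literals come out the same either way, so a mismatch like this only shows up once a program emits a multi-byte value.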
I’m not really planning to write more ROSE-8 programs, and I did have an assembler already. But while there were a few things I was missing, overall I was just very taken with the benefits of having all this infrastructure just available. And I have a sense that this isn’t just a good format for code, but for many binary outputs you might want to write by hand, like protoscope for protobuf-like formats. Not a common thing, but when you need it, it’s invaluable. [1]
P.S. If you want an instruction that does not emit anything, you can use a zero-width literal, like 0`0.
[1] Unfortunately, one thing customasm doesn’t do well is length-prefixing, which is needed for both protobuf and IFF-like formats like mp4. You can manually do it by measuring distance to an end label, but that’s kind of annoying, especially for protobuf where the length to be emitted is itself variable-length. But hey, it’s an open-source project, if I really needed it I could implement it myself.
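As a rough illustration of why the protobuf case is awkward, here is a Python sketch of varint length-prefixing. The function names are mine, and this is a simplified model of the wire format rather than anything customasm or protoscope provides.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf-style base-128 varint."""
    if n < 0:
        raise ValueError("varint sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = n & 0x7F
        n >>= 7
        if n:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def length_prefixed(payload: bytes) -> bytes:
    """Prefix the payload with its length, encoded as a varint."""
    return encode_varint(len(payload)) + payload


# The prefix takes one byte up to 127 payload bytes and two bytes from 128 on,
# so the size of the prefix depends on the size of what follows it.
print(len(encode_varint(127)))  # 1
print(len(encode_varint(128)))  # 2
```

Because the prefix width depends on the payload length, an assembler has to know the distance to the end label before it knows how many bytes the prefix itself takes up, and with nested messages each inner prefix also changes the length of the enclosing one.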