TQ4HAM: No Taxation on Representation! (LLO Archive)
Created 2025-12-13, last modified 2025-12-13. Visibility: public
Part of my archive of Layover Linux Official posts on Tumblr.
2025-11-15
As you might know from other posts I've made about Prone, or from using Python (where I borrowed the term from), the repr() function is used to make a human-readable text representation of a value. When you see a value printed out in the interactive console:
prn> 1u8 + 5u8
u8(6)
The function that turns a number into the characters u,8,(,6,) is called repr.
At this point in my optimization journey, it was becoming obvious that generating reprs for lists was a taking a significant percentage of CPU cycles. Now, it's true that some tests were using list reprs a lot, essentially for debugging reasons. But the thing is, repr() should be cheap. It's a core part of the language that gets a lot of exercise in everyday situations. If it's slow, that's not a fake or contrived problem, it's a real thing to address.

GIF by nicky9door: Lieutenant Columbo presses his finger to his lips, considering the problem.
So what's slow about list representation? Well you see sir, if you take a look in there... ohhhh it's full of objects being created, used a little, then thrown away again. Terrible thing to see. Real wasteful. It's mostly strings, you know - take for example, representing a list of 4 atoms.
[#a, #b, #c, #d]
If you walk through the code, we make a string that contains a "[", and then a string that says "#a", then we smush 'em together (creating a new string that says "[#a"), which destroys both the first two strings. We keep making a new little piece, then smushing it together with what we have already, and each smush (called concatenation) destroys both the strings you put into it. Which is bad enough as it is, plenty bad, but...

GIF by taramysweetlove: Columbo approaches the camera to say, "Oh there's just one more thing, sir."
... where'd the "#a" come from? And the "#b" and the... well, I hate to rattle on. You understand, I'm sure, being a reader of your stature.
Here's what was buggin' me. A list can contain anything. So for each item in the list, we have to make a repr for that item, and we end up gluing all those reprs together (with some brackets and commas) to get the repr for the list. So even on the best of days, for a list with 100 items, we're making and destroying somewhere over 100 temporary strings.

GIF by pajamasecrets: Columbo says, "My wife is not gonna believe this."
Actually, speaking of. My wife, she writes sometimes. Small things, but oh, I love to read 'em. Little insights into how her mind works. And she was telling me that when she writes, she doesn't usually write each word on a separate blank page, and then use more pages to combine them together. See she's more of a linear writer. She uses one page - just one - because she knows she's always adding on at the end, you know? Never in the middle, always the end.
This had me wondering. When I build myself a repr, no matter how recursive the process might be, going into and out of all the little objects contained in each other... well, you might see where I'm going with this. It's always at the end, where I'm adding on. Like clockwork. So what I really need is one string, and a way to make it bigger by adding to the end of it.
Of course, you know, I'm making my own tools. So I started off with the ability to glue one String onto the end of the other, making the left one bigger. I ended up using this in multiple places, but I got even more mileage by adding one more function, to add raw C strings to the end of Prone String objects. So for things like the square brackets, and the commas, et cetera and et cetera, you wouldn't even make an object in the heap for that.
So at this point, I had something looking like this:
String *list_repr(VVec *list) {
// TODO: implement _wrepr functions that append to an existing string
if (list->vec.len == 0) {
vvec_release(list);
return STR("[]");
}
if (list->vec.len == 1) {
String *repr = string_alloc(32);
repr = string_append_cstr(repr, "[", 1);
repr = string_append(repr, dv_repr(dv_lend(list->vec.buf[0])));
repr = string_append_cstr(repr, "]", 1);
vvec_release(list);
return repr;
}
String *repr = string_alloc(32);
repr = string_append_cstr(repr, "[", 1);
for (size_t i = 0; i < list->vec.len; i++) {
if (i)
repr = string_append_cstr(repr, ", ", 2);
repr = string_append(repr, dv_repr(dv_lend(list->vec.buf[i])));
}
vvec_release(list);
return string_append_cstr(repr, "]", 1);
}
Already a big improvement, I have some special cases, it's beautiful. But I still have to make a new Prone String on the heap for each item in the list. And that was the next thing to solve. I needed a bunch of "write repr" functions, or wrepr, that take a starting string as an input, and then write a little to the end of it. Because then, I can wire the wreprs to each other: they all take a string in progress, and pass it along, taking turns writing to the end of it in an orderly and law-abiding fashion.
Of course, we still need the repr functions, but now they can be simple little wrappers that make an empty string with some up-front space, and pass that as the blank page into the right wrepr function. All the real work happens in this network of wrepr functions, so repr is just an easy on-ramp to that freeway.
String *list_wrepr(String *base, VVec *list) {
if (list->vec.len == 0) {
vvec_release(list);
return string_append_cstr(base, "[]", 2);
}
if (list->vec.len == 1) {
base = string_append_cstr(base, "[", 1);
base = dv_wrepr(base, dv_lend(list->vec.buf[0]));
base = string_append_cstr(base, "]", 1);
vvec_release(list);
return repr;
}
base = string_append_cstr(base, "[", 1);
for (size_t i = 0; i < list->vec.len; i++) {
if (i)
base = string_append_cstr(base, ", ", 2);
base = dv_wrepr(base, dv_lend(list->vec.buf[i]));
}
base = string_append_cstr(base, "]", 1);
vvec_release(list);
return base;
}
String *list_repr(VVec *list) { return list_wrepr(string_alloc(16), list); }
Because of Prone's copy-on-write memory management, this looks like a pattern you'll see a lot around here: modifying values by chucking 'em into a function, and letting it give you a new version back out.
This speedup was enormous, I tell you, simply enormous. Got me a lot closer to that half milli mark, which, you know, I'm very eager to do. But you see, sir, it wasn't quite enough on its own. The most expensive stuff we do is still memory management, whether it's allocating or freeing or lending or releasing. It's always good sense to not use so many objects. And that's when I realized something real special.
This new atom text mechanism from a few chapters back? It could store identifiers and number literals both, just inline in the DV. No construct wrapper around it.

GIF by nickyandmikey: Columbo raising his hands in the air in celebration. Things are lookin' up!
So, you'll see in a few days here, but this kicked off a series of big changes. Call it the "deconstruction" of Prone. I call it the Type Matrix, and as a preview, here's my first draft on a whiteboard.


