Lesson 3.2
Values and Types in Memory
MonkValue: the tagged union that represents every value in Monk. How types exist at runtime.
One variable, many shapes.
In Monk, a variable can hold an integer, a float, a string, a boolean, none, an array, a record, or a function. All of these are different shapes of data with different sizes in memory.
C doesn't work this way. In C, every variable has a fixed type decided at compile time. An int is always an int. A char* is always a pointer to characters. There's no "this could be anything" type.
So how do you represent Monk values in C? You build a container that can hold any of them, and you attach a label that says which one it's currently holding.
The tagged union: a box with a label.
Imagine a box that can hold exactly one item at a time: a book, a mug, or a phone. The box is always the same size (big enough for the largest item). A sticker on the outside tells you what's currently inside.
That's a tagged union. The "tag" is the sticker (an enum saying the type). The "union" is the box (memory that can hold any of the possible values). Together, they form a single structure that represents any Monk value.
MonkValue structure
Only ONE of the data fields is active at a time. The tag tells you which one.
Every Monk value at runtime is a MonkValue. The integer 42, the string "hello", the array [1, 2, 3] -- same struct, different tag.
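As a rough sketch, the box and its sticker could be declared like this in C. Only MONK_INT, MONK_NONE, data.integer, and data.string appear in this lesson; the other tag and field names, and the opaque heap-side types, are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

// Heap-side structures, left opaque here (their layout is not covered yet).
typedef struct MonkArray MonkArray;
typedef struct MonkRecord MonkRecord;
typedef struct MonkFunction MonkFunction;

// The sticker: one enumerator per value kind.
// Names other than MONK_INT and MONK_NONE are assumed.
typedef enum {
    MONK_INT,
    MONK_FLOAT,
    MONK_STRING,
    MONK_BOOL,
    MONK_NONE,
    MONK_ARRAY,
    MONK_RECORD,
    MONK_FUNCTION
} MonkTag;

// The box: a tag plus a union whose members all share the same memory.
typedef struct {
    MonkTag tag;
    union {
        int64_t integer;        // MONK_INT
        double floating;        // MONK_FLOAT (field name assumed)
        bool boolean;           // MONK_BOOL (field name assumed)
        char *string;           // MONK_STRING
        MonkArray *array;       // MONK_ARRAY (field name assumed)
        MonkRecord *record;     // MONK_RECORD (field name assumed)
        MonkFunction *function; // MONK_FUNCTION (field name assumed)
    } data;                     // MONK_NONE uses no member at all
} MonkValue;
```

With this layout, a MonkValue would typically occupy 16 bytes on a 64-bit platform: the tag, some padding, and 8 bytes for the union.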
Reading and writing a tagged union.
To create a MonkValue holding the integer 42, you set the tag to MONK_INT and store 42 in the integer field of the union. In conceptual C:
// Creating a MonkValue for the integer 42
MonkValue v;
v.tag = MONK_INT;
v.data.integer = 42;
To read it back, you check the tag first, then access the right field:
// Reading the value back
if (v.tag == MONK_INT) {
int64_t n = v.data.integer; // safe: tag confirms it's an int
}
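Reading the wrong field (e.g., reading data.string when the tag is MONK_INT) reinterprets the stored bytes as the wrong type. Here the integer's bits would be treated as a pointer, and dereferencing that garbage pointer is undefined behavior in C. The tag is your safety net. Always check it.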
In practice, the runtime provides helper functions so you never construct MonkValues by hand. You call something like monk_int(42) which returns a properly tagged MonkValue.
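Here is one way such helpers might look. This is a sketch with a reduced MonkValue, not the runtime's actual code: monk_int is the name mentioned above, while monk_none and monk_as_int are hypothetical names added to show a constructor for a payload-free value and a checked accessor.

```c
#include <assert.h>
#include <stdint.h>

// Reduced tag set and struct, just enough for this example.
typedef enum { MONK_INT, MONK_FLOAT, MONK_NONE } MonkTag;

typedef struct {
    MonkTag tag;
    union {
        int64_t integer;
        double floating;
    } data;
} MonkValue;

// Constructor: sets the tag and the matching field together,
// so the two can never disagree.
static MonkValue monk_int(int64_t n) {
    MonkValue v;
    v.tag = MONK_INT;
    v.data.integer = n;
    return v;
}

// Hypothetical constructor for none: the tag alone carries the meaning.
static MonkValue monk_none(void) {
    MonkValue v;
    v.tag = MONK_NONE;
    v.data.integer = 0; // no payload; zeroed so the bytes are defined
    return v;
}

// Hypothetical checked accessor: refuses to read the wrong field.
static int64_t monk_as_int(MonkValue v) {
    assert(v.tag == MONK_INT);
    return v.data.integer;
}
```

Calling monk_as_int(monk_int(42)) gives back 42, while calling it on monk_none() trips the assertion in a debug build instead of silently returning garbage.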
Why a union, not a struct with all fields?
You might wonder: why not just put all fields in one struct? Have an int field, a float field, a string field, and so on. Use whichever one you need.
Memory. A struct allocates space for every field simultaneously. If you have 8 fields, you pay for all 8 even when you're only using one.
A union overlaps all fields in the same memory. It's only as large as its largest member. Since a MonkValue holds only one kind of value at a time, the other fields would never be in use anyway, so overlapping them costs nothing.
Struct (wasteful)
Total: sum of all fields
Union (efficient)
Total: size of largest field
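You can check the difference yourself with sizeof. The sketch below declares the same seven payload fields twice, once as a struct and once as a union. The type names and the helper are made up for this comparison, and the heap-side pointers are written as void* to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Every field gets its own storage: the size is the sum (plus padding).
typedef struct {
    int64_t integer;
    double floating;
    bool boolean;
    char *string;
    void *array;
    void *record;
    void *function;
} PayloadAsStruct;

// All fields share the same storage: the size is that of the largest member.
typedef union {
    int64_t integer;
    double floating;
    bool boolean;
    char *string;
    void *array;
    void *record;
    void *function;
} PayloadAsUnion;

// Bytes saved per value by overlapping the fields.
static size_t payload_bytes_saved(void) {
    return sizeof(PayloadAsStruct) - sizeof(PayloadAsUnion);
}
```

On a typical 64-bit platform the struct comes to 56 bytes and the union to 8, so an array of a million values would differ by tens of megabytes. Exact sizes depend on the compiler and target.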
What lives on the stack vs. the heap.
The MonkValue struct itself is a fixed size, so it can live on the stack as a local variable or sit inline in an array. For some kinds, though, the data it refers to lives on the heap:
Stack (inline in the union)
Integers (int64_t), floats (double), booleans (bool), and none. These are small, fixed-size values that fit directly in the union.
Heap (pointer in the union)
Strings (char*), arrays (MonkArray*), records (MonkRecord*), functions (MonkFunction*). The union holds a pointer; the actual data is allocated on the heap.
This matters because heap-allocated data needs to be freed when it's no longer used. The runtime's monk_free() function checks the tag and, if the value points to heap data, frees it recursively.
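A recursive free along those lines could look roughly like this. The signature of monk_free, the MonkArray layout (a heap buffer of items plus a count), and the choice to reset the tag afterwards are all assumptions made for this sketch. Only strings and arrays are handled, to keep it short.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum { MONK_INT, MONK_STRING, MONK_ARRAY, MONK_NONE } MonkTag;

typedef struct MonkArray MonkArray;

typedef struct {
    MonkTag tag;
    union {
        int64_t integer;
        char *string;
        MonkArray *array;
    } data;
} MonkValue;

// Hypothetical array layout: a heap buffer of values and its length.
struct MonkArray {
    MonkValue *items;
    size_t count;
};

// Frees whatever heap data the value owns, then marks it as none.
static void monk_free(MonkValue *v) {
    switch (v->tag) {
    case MONK_STRING:
        free(v->data.string);
        break;
    case MONK_ARRAY:
        // Free each element first: elements may own heap data too.
        for (size_t i = 0; i < v->data.array->count; i++) {
            monk_free(&v->data.array->items[i]);
        }
        free(v->data.array->items);
        free(v->data.array);
        break;
    default:
        break; // inline kinds own no heap memory
    }
    v->tag = MONK_NONE; // a second monk_free on this value is now harmless
}
```

The order matters in the array case: the elements are freed before the buffer that holds them, and the buffer before the MonkArray that points to it.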
The eight value kinds.
Note: MONK_NONE uses no data at all. The tag alone is enough. It's the Monk equivalent of null, but explicit -- you always know when you're dealing with none.
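Code that works with values usually dispatches on the tag with a switch. The sketch below maps each kind to its name and to whether it owns heap data. The function names and the tag names other than MONK_INT and MONK_NONE are assumptions for illustration.

```c
#include <stdbool.h>

typedef enum {
    MONK_INT,
    MONK_FLOAT,
    MONK_STRING,
    MONK_BOOL,
    MONK_NONE,
    MONK_ARRAY,
    MONK_RECORD,
    MONK_FUNCTION
} MonkTag;

// Human-readable name for each of the eight kinds.
static const char *monk_kind_name(MonkTag tag) {
    switch (tag) {
    case MONK_INT:      return "int";
    case MONK_FLOAT:    return "float";
    case MONK_STRING:   return "string";
    case MONK_BOOL:     return "bool";
    case MONK_NONE:     return "none";
    case MONK_ARRAY:    return "array";
    case MONK_RECORD:   return "record";
    case MONK_FUNCTION: return "function";
    }
    return "unknown"; // only reachable if the tag is corrupted
}

// True for kinds whose union field is a pointer to heap data.
static bool monk_is_heap_kind(MonkTag tag) {
    switch (tag) {
    case MONK_STRING:
    case MONK_ARRAY:
    case MONK_RECORD:
    case MONK_FUNCTION:
        return true;
    default:
        return false; // int, float, bool, none
    }
}
```

Listing every enumerator in the first switch, with no default case, lets most compilers warn you when a new kind is added to the enum but not handled.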
A note on strings.
Monk strings are heap-allocated, null-terminated, UTF-8 encoded. The char* pointer in the union points to a dynamically allocated buffer on the heap.
One consequence: getting the length of a string counts UTF-8 codepoints, not bytes. The string "café" is 5 bytes in UTF-8 but length("café") returns 4. This is an O(n) operation -- the runtime walks the string to count codepoints.
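Counting codepoints can be sketched in a few lines of C. In UTF-8, every byte after the first byte of a multi-byte character has the bit pattern 10xxxxxx, so you can count codepoints by counting the bytes that do not match it. The function name utf8_length is made up for this example, and the code assumes the input is valid UTF-8.

```c
#include <stddef.h>

// Counts codepoints in a null-terminated UTF-8 string.
// Walks every byte once, so the cost is O(n) in the byte length.
static size_t utf8_length(const char *s) {
    size_t count = 0;
    for (; *s != '\0'; s++) {
        // Continuation bytes look like 10xxxxxx; skip them.
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For "caf\xC3\xA9" (the five UTF-8 bytes of "café"), the loop counts c, a, f, and the lead byte 0xC3, and skips the continuation byte 0xA9, giving 4.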
Key takeaways
MonkValue is a tagged union: a type tag plus a union of possible data fields. Only one field is active at a time.
Eight value kinds: int, float, string, bool, none, array, record, function.
Small values (int, float, bool) live inline in the union, and none needs no data at all. Strings, arrays, records, and functions are heap-allocated, with a pointer in the union.
A union is more memory-efficient than a struct because all fields overlap. You only pay for the largest one.