After this Winter Semester I can now finally count C as one of the languages I am comfortable in.
I learned programming C in a course that could loosely translate to “programming close to system” or “machine oriented programming”.
We devised a virtual machine for the ninja programming language that was designed from the ground up as a stack machine. A stack machine is a easy start as you can quickly design the stack and binary operations that consume operands from the stack and produce one new result to be pushed on the stack.
We started with a purely integer based stack, because we wanted the encoding of instructions to be conceptualized before we added an object based stack.
The encoding of instructions went like so:
You have a 32-bit unsigned integer. The first 24 Bits handled the immediate value. For example for push 12 the immediate value is 12. The leading 8 bits were used to encode the operations. To get the opcode from the integer you need to shift the bits a little, like so:
(encoded_int) & 0xFF000000) >> 24
This essentially first cuts of all bits except the leading 8 bits and then shifts it all to the right so only the leading bits remain without any trailing zeroes.
The instruction to isolate the immediate value is a bit easier as we can ignore the shifting. We just need to mask it.
(encoded_int) & 0x00FFFFFF
But what if we ever wanted to pass negative values as an immediate value? This is not possible with the current design per se, but we can use a very smart trick:
(immediate) & 0x00800000 ? (immediate) | 0xFF000000 : (immediate)
By checking if the leftmost bit of our immediate range is set we can either return the instruction where the opcode has been replaced by FF to sign it as negative, or we can return the immediate as it is when it doesn’t need to be signed.
This way both instruction and value were neatly packed into unsigned integers. Now we could have an array of integers that make up the whole program. By iterating through this array and executing one instruction at a time you can essentially run the program.
Next up we implemented memory allocation along with real objects and handling their pointers on the stack. We implemented global and static variables. We took care of local variables and stack frames for calling procedures or jump instructions. We implemented handling for control structures such as if, if-else, while- and do-loops.
After all this groundwork was done we got to a really exciting point: Garbage collection. From previous experience with Java (which does GC automatically) it sounded like Garbage Collection was a relatively easy concept. It wasn’t.
We decided to implement the Stop & Copy (Minsky, Fenichel, Yochelson, 1969) garbage collection.
I won’t talk about it here as it would blow up this post but I really recommend checking it out.