November 10th, 2004


Ruby2C interpreter

I've been working on building an interpreter for Ruby2C. I want
to keep support for the following operators:

  • True, false, integer and nil literals
  • Addition, subtraction and multiplication
  • And, or and not logical operations
  • Nil?, equal?, less-than and less-than-or-equal comparisons
  • Local variable assignment and reference
  • Function "definition", call and return
  • If for branching
  • While for looping
  • Memory allocation and referencing

In addition to the operators listed above, the interpreter
currently supports:

  • String literals
  • String interpolation
  • Global variables

Current Thoughts

My major milestone target for this interpreter is building an
interpreter that we can run through ruby2c and then have it interpret

Strings add extra complication to that, which I'd rather not have
to worry about in the first pass. String interpolation is
simply unnecessary.

I certainly don't want global variables in the interpreter
itself, since they make it nearly impossible to build the
interpreter to be re-entrant.

String functionality could be built on top of the current
milestone by turning string literals into lists of ASCII
characters and then writing functions in the subset that can
manipulate lists. I'm not completely sold on adding them to the
supported operators.
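
As a rough sketch of what that could look like (not code from the
interpreter): here a pair is modelled as a two-element Ruby Array,
and the names cons, string_to_list and list_length are made up for
illustration.

```ruby
# Hypothetical sketch: a pair is a two-element Array [data, rest],
# and nil marks the end of the list.
def cons(data, rest)
  [data, rest]
end

# Turn a string literal into a list of ASCII codes. This part would
# happen at translation time, so it can use all of Ruby.
def string_to_list(str)
  str.bytes.reverse.inject(nil) { |rest, byte| cons(byte, rest) }
end

# A list function that sticks to while, nil?, addition and pair
# referencing, i.e. things in the supported subset.
def list_length(list)
  count = 0
  while !list.nil?
    count = count + 1
    list = list[1]
  end
  count
end

hi = string_to_list("hi")   # [104, [105, nil]]
list_length(hi)             # 2
```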

Function "definition" is a not-quite-accurate description of how
functions will be implemented. A function definition will be a label
that gets jumped to after some setup is done on the stack. A
function call does that setup and jumps, and a function return
undoes the setup, puts its return value in the right place, and
jumps back.
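
The call/return dance described above can be sketched with a toy
stack machine. Everything here is an assumption made for
illustration: the instruction names (:push, :arg, :add, :call, :ret,
:halt), the frame layout (return address, saved frame pointer,
argument count) and the run method are not taken from the
interpreter.

```ruby
# Hypothetical toy machine. A function "definition" is only an entry
# in the labels table; there is no function object.
def run(program, labels)
  stack = []
  fp = 0   # frame pointer: index just past the current frame's setup
  pc = 0   # program counter
  loop do
    op, a, b = program[pc]
    pc += 1
    case op
    when :push
      stack.push(a)
    when :add
      right = stack.pop
      left = stack.pop
      stack.push(left + right)
    when :arg
      # Arguments sit just below the three setup slots.
      argc = stack[fp - 1]
      stack.push(stack[fp - 3 - argc + a])
    when :call
      # Setup: remember where to jump back to, the caller's frame,
      # and how many arguments were pushed. Then jump to the label.
      stack.push(pc, fp, b)
      fp = stack.size
      pc = labels.fetch(a)
    when :ret
      value = stack.pop
      stack.slice!(fp..-1)   # drop temporaries left by the body
      argc = stack.pop       # undo the setup in reverse order
      fp = stack.pop
      pc = stack.pop         # this is the jump back
      stack.pop(argc)        # remove the caller's arguments
      stack.push(value)      # leave the result for the caller
    when :halt
      return stack.pop
    end
  end
end

labels = { add2: 4 }
program = [
  [:push, 3],          # 0: first argument
  [:push, 4],          # 1: second argument
  [:call, :add2, 2],   # 2: do the setup and jump
  [:halt],             # 3: the result is on top of the stack
  [:arg, 0],           # 4: the add2 label points here
  [:arg, 1],           # 5
  [:add],              # 6
  [:ret]               # 7
]
result = run(program, labels)   # 7
```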

I'm not quite certain how useful while becomes once jumping to
functions is implemented, other than for prettiness of the code.

If will need to jump over the then block when the condition is

Memory Allocation

Memory allocation is currently being done just like Lisp/Scheme's
cons. A pair is generated where the first value is the data, and
the second value is a pointer to another pair or nil.

For the first round I think I'll just allocate a block of space and
be done with it, crashing when memory is exhausted. C's malloc/free
and garbage collection can be added later.
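
A minimal sketch of that first round, assuming the block of space is
a plain Ruby Array and a pointer is an index into it. The Heap class,
its method names and the NIL constant are made up for this example.

```ruby
# Hypothetical sketch: one pre-allocated block, two slots per pair,
# and no way to give memory back.
class Heap
  NIL = -1   # assumed stand-in for the nil pointer in this sketch

  def initialize(pairs)
    @memory = Array.new(pairs * 2)
    @next_free = 0
  end

  # Allocate a pair: first slot is the data, second slot points at
  # another pair or at NIL. Crash when the block is used up.
  def cons(data, rest)
    raise "out of memory" if @next_free + 2 > @memory.size
    pointer = @next_free
    @memory[pointer] = data
    @memory[pointer + 1] = rest
    @next_free += 2
    pointer
  end

  def car(pointer)
    @memory[pointer]
  end

  def cdr(pointer)
    @memory[pointer + 1]
  end
end

heap = Heap.new(2)
list = heap.cons(1, heap.cons(2, Heap::NIL))
heap.car(list)             # 1
heap.car(heap.cdr(list))   # 2
# A third heap.cons would raise "out of memory".
```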

nil, true, and false will be represented by fixed even pointers,
while an integer n is represented as the odd value 2n+1.
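
To make the encoding concrete, here is a small sketch. The particular
even values chosen for nil, false and true are assumptions (the notes
above only say they are fixed and even), and the helper names are
invented for the example.

```ruby
# Assumed values; any three fixed even numbers would do.
NIL_PTR = 0
FALSE_PTR = 2
TRUE_PTR = 4

# An integer n is stored as the odd value 2n+1.
def box_integer(n)
  2 * n + 1
end

def unbox_integer(word)
  (word - 1) / 2
end

# The low bit tells integers apart from pointers.
def integer_word?(word)
  word.odd?
end

# Addition can work on boxed values directly:
# (2a+1) + (2b+1) - 1 == 2(a+b) + 1
def add_boxed(x, y)
  x + y - 1
end

box_integer(5)                               # 11
unbox_integer(11)                            # 5
add_boxed(box_integer(2), box_integer(3))    # 11, i.e. boxed 5
```

For the low-bit test to hold, pointers to pairs would also have to
be even, which falls out naturally if pairs are two slots wide and
the block starts at an even address.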

This isn't very efficient, but it will suffice for a major


In order to be capable of bootstrapping itself, the interpreter
will need to write some kind of bytecode that can be pre-loaded
into memory.

To help with this goal, the call and defn sexps were rewritten to
include the length of the arguments they use.

Currently the interpreter stores its local variables in an
Environment class, which is an Array of Hashes. The current
layer of the Environment can be stripped off and used to allocate
space for all the locals that could be used in the function body.

Once all temporary space is pre-allocated and accounted for,
local variables will turn into offsets on the stack.
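
Roughly, the idea looks like this. The Environment class below is a
cut-down stand-in written for this example, not the interpreter's
actual class, and stack_offsets is a made-up helper.

```ruby
# Hypothetical stand-in: an Array of Hashes, one Hash per layer,
# each mapping a local variable name to its value.
class Environment
  def initialize
    @layers = [{}]
  end

  def push_layer
    @layers.push({})
  end

  # Strip the current layer off and hand it back to the caller.
  def pop_layer
    @layers.pop
  end

  def []=(name, value)
    @layers.last[name] = value
  end

  def [](name)
    @layers.last[name]
  end
end

# Give every local in a stripped-off layer a fixed slot, numbered in
# the order the names were first assigned.
def stack_offsets(layer)
  layer.keys.each_with_index.to_h
end

env = Environment.new
env.push_layer
env[:x] = 1
env[:sum] = 0
layer = env.pop_layer
offsets = stack_offsets(layer)   # {:x => 0, :sum => 1}
```

With a table like offsets in hand, a reference to sum in the function
body could be rewritten as "slot 1 of the current frame", and the
frame would need offsets.size slots reserved for locals.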


I've run out of thoughts on this for now.