A register-based, garbage-collected, stackless, lightweight Virtual Machine for object-oriented programming languages.
It ships with a naive reference-counting Garbage Collector; a more advanced GC algorithm, probably Baker's treadmill, is planned.
Eventually the VM will be optimized to be fast on ARM processors, but for now it builds for both ARM (tested on Android) and x86 architectures. It will also use LLVM to JIT-compile bytecode to native code at runtime.
Anyway, it's a work in progress! :)
$ git clone git://github.com/txus/terrorvm.git
$ cd terrorvm
$ make
To run the tests:
$ make dev
And to clean the mess:
$ make clean
TerrorVM runs .tvm bytecode files such as the hello_world.tvm under the examples directory.
$ ./bin/vm examples/hello_world.tvm
It ships with a simple compiler written in Ruby (Rubinius) that compiles a
tiny subset of Ruby to .tvm files. Check out the compiler directory, which
has its own Readme, and compiler/examples, where we have the hello_world.rb
file used to produce hello_world.tvm.
TerrorVM doesn't need Ruby to run; even the example compiler is a proof of concept and could be written in any language (even in C obviously).
TerrorVM is designed to run dynamic languages. You can easily implement a compiler of your own that compiles your favorite dynamic language down to TVM bytecode.
I've written a demo compiler in Ruby under the compiler/ folder, just to
show how easy it is to write your own. This demo compiler compiles a subset of
Ruby down to TerrorVM bytecode, so you can easily peek at the source code or
just copy and modify it.
You can write your compiler in whatever language you prefer, of course.
TerrorVM files begin with a header naming _main (the method that will be the
entry point), followed by the number of registers, local variables, literals
and instructions used by the method, then all the literals, and then all the
instructions. It starts like this:
_main
Then the counts, encoded in the format :num_registers:num_locals:num_literals:num_instructions, for example:
:10:2:4:17
Then all the literals, one per line (the ones starting with " are string literals):
123
"print
"Goodbye world!
"Hello world!
And then all the instructions:
0x2000000
0x51000000
0x9010000
0x51010100
...
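The layout above can be read back with a few lines of Ruby. This is a minimal sketch, not TerrorVM's own loader: `parse_tvm` is a hypothetical helper, it assumes the file holds a single method, and the sample uses an instruction count of 4 (instead of 17) because the listing above is truncated.

```ruby
# Hypothetical helper (not part of TerrorVM): parses the text of a .tvm file
# that contains a single method laid out as described above.
def parse_tvm(source)
  lines = source.lines.map(&:chomp)
  name = lines.shift

  # ":10:2:4:17" splits into ["", "10", "2", "4", "17"]; drop the empty field.
  counts = lines.shift.split(":").reject(&:empty?).map { |field| Integer(field, 10) }
  num_registers, num_locals, num_literals, num_instructions = counts

  literals = lines.shift(num_literals).map do |line|
    # A leading double quote marks a string literal; anything else is
    # assumed to be an integer.
    line.start_with?('"') ? line[1..-1] : Integer(line, 10)
  end

  # Integer() without an explicit base honors the 0x prefix.
  instructions = lines.shift(num_instructions).map { |line| Integer(line) }

  {
    name: name,
    num_registers: num_registers,
    num_locals: num_locals,
    literals: literals,
    instructions: instructions
  }
end

sample = <<~TVM
  _main
  :10:2:4:4
  123
  "print
  "Goodbye world!
  "Hello world!
  0x2000000
  0x51000000
  0x9010000
  0x51010100
TVM

main = parse_tvm(sample)
puts main[:name]
puts main[:literals].inspect
```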
Instructions have a compact representation: an 8-bit opcode plus three 8-bit operands, for a total of 32 bits per instruction.
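As a rough illustration, a 32-bit instruction word can be split into four bytes with shifts and masks. The byte order here is an assumption inferred from the sample words above (opcode in the highest byte, then operands A, B and C); `decode_instruction` is a hypothetical name, not a TerrorVM function.

```ruby
# Hypothetical decoder. Assumes the layout [opcode][A][B][C], highest byte
# first, which is inferred from the sample instructions and not confirmed.
def decode_instruction(word)
  {
    opcode: (word >> 24) & 0xff,
    a: (word >> 16) & 0xff,
    b: (word >> 8) & 0xff,
    c: word & 0xff
  }
end

p decode_instruction(0x9010000)
p decode_instruction(0x51010100)
```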
TerrorVM exposes a VM object that responds to primitive, which returns a
hash with some VM primitive functions exposed as Terror Function objects.
A simple example is the set of arithmetic functions (+, -, *, /) used by
Integer objects. To use them in your functions, do it like this:
VM.primitive[:+].apply(3, 4) # this is the same as 3 + 4 or 3.+(4)
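To make the shape of that API concrete, here is a small model of it in plain Ruby. It is only a sketch for experimenting outside the VM: `FunctionModel` and `VMModel` are hypothetical stand-ins, and the real primitives live inside TerrorVM itself.

```ruby
# Hypothetical stand-in for a Terror Function object: wraps a block and
# runs it when #apply is called.
class FunctionModel
  def initialize(&body)
    @body = body
  end

  def apply(*args)
    @body.call(*args)
  end
end

# Hypothetical stand-in for the VM object: #primitive returns a hash that
# maps names to function objects.
module VMModel
  PRIMITIVES = {
    :+ => FunctionModel.new { |a, b| a + b },
    :- => FunctionModel.new { |a, b| a - b },
    :* => FunctionModel.new { |a, b| a * b },
    :/ => FunctionModel.new { |a, b| a / b }
  }.freeze

  def self.primitive
    PRIMITIVES
  end
end

puts VMModel.primitive[:+].apply(3, 4)
```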
- Hello world (Ruby code, TVM bytecode)
- Numbers (Ruby code, TVM code)
- Objects with prototypal inheritance (Ruby code, TVM bytecode)
- Exposed VM primitives (Ruby code, TVM bytecode)
- NOOP: no operation -- does nothing.
- MOVE A, B: copies the contents in register B to register A.
- LOADI A, B: loads the integer (from the literal pool) at index B into register A.
- LOADS A, B: loads the string (from the literal pool) at index B into register A.
- LOADNIL A: loads the special value nil into register A.
- LOADBOOL A, B: loads a boolean into register A, being true if B is the number 1 or false if B is 0.
- LOADSELF A: loads the current self into register A.
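The load and move instructions above can be modeled with an array of registers and a literal pool. This is a sketch under stated assumptions: instructions are written as symbolic tuples such as `[:loadi, 0, 0]` (the VM's numeric opcodes are not used), and `run_loads` is a hypothetical helper.

```ruby
# Hypothetical model of the load/move instructions. Ops are symbols chosen
# for this example, not TerrorVM's numeric opcodes.
def run_loads(program, literals, current_self = nil, num_registers = 4)
  registers = Array.new(num_registers)
  program.each do |op, a, b|
    case op
    when :noop then next
    when :move then registers[a] = registers[b]
    when :loadi, :loads then registers[a] = literals[b]
    when :loadnil then registers[a] = nil
    when :loadbool then registers[a] = (b == 1)
    when :loadself then registers[a] = current_self
    else raise ArgumentError, "unknown op #{op}"
    end
  end
  registers
end

literals = [123, "print"]
program = [
  [:loadi, 0, 0],    # register 0 <- literals[0]
  [:loads, 1, 1],    # register 1 <- literals[1]
  [:move, 2, 0],     # register 2 <- register 0
  [:loadbool, 3, 1]  # register 3 <- true
]
p run_loads(program, literals)
```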
- JMP A: unconditionally jumps A instructions.
- JIF A, B: jumps A instructions if the contents of the register B are either false or nil.
- JIT A, B: jumps A instructions if the contents of the register B are neither false nor nil.
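The jump conditions boil down to one truthiness rule: only false and nil count as false, so values such as 0 still take a JIT jump. The sketch below models that rule. How the offset is applied is an assumption (here it is counted from the instruction after the jump), and `falsy?`, `jump_taken?` and `next_pc` are hypothetical names.

```ruby
# Only false and nil are falsy, as described for JIF and JIT.
def falsy?(value)
  value.nil? || value == false
end

# Decides whether a jump is taken. `value` is the content of register B
# and is ignored for :jmp.
def jump_taken?(op, value = nil)
  case op
  when :jmp then true
  when :jif then falsy?(value)
  when :jit then !falsy?(value)
  else raise ArgumentError, "unknown op #{op}"
  end
end

# Assumption: the offset counts from the instruction after the jump, so a
# taken jump of 3 skips three instructions.
def next_pc(pc, op, offset, value = nil)
  jump_taken?(op, value) ? pc + 1 + offset : pc + 1
end

puts next_pc(5, :jif, 3, nil)
puts next_pc(5, :jif, 3, 0)
```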
- LOADLOCAL A, B: loads the value in the locals table at index B into the register A.
- SETLOCAL A, B: stores the contents of the register A in the locals table at index B.
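In both instructions A names the register and B the index in the locals table; only the direction of the copy changes. A minimal sketch, with hypothetical helper names and plain arrays standing in for the register file and the locals table:

```ruby
# Hypothetical helpers: registers and locals are plain Ruby arrays.
def loadlocal(registers, locals, a, b)
  registers[a] = locals[b]
end

def setlocal(registers, locals, a, b)
  locals[b] = registers[a]
end

registers = [42, nil]
locals = [nil, nil]
setlocal(registers, locals, 0, 1)   # locals[1] <- register 0
loadlocal(registers, locals, 1, 1)  # register 1 <- locals[1]
p registers
p locals
```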
- LOADSLOT A, B, C: loads the slot named C from the object B to the register A.
- SETSLOT A, B, C: sets the slot named B from the object A to the value in the register C.
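Note that the two slot instructions order their operands differently. The sketch below makes that explicit. It rests on two assumptions not stated above: slot names are taken from the literal pool at the given index, and objects are modeled as plain hashes with no prototype lookup. `loadslot` and `setslot` are hypothetical helpers.

```ruby
# Hypothetical model: objects are hashes, and slot names are assumed to be
# strings stored in the literal pool.
def loadslot(registers, literals, a, b, c)
  # register A <- slot named literals[C] of the object in register B
  registers[a] = registers[b][literals[c]]
end

def setslot(registers, literals, a, b, c)
  # slot named literals[B] of the object in register A <- register C
  registers[a][literals[b]] = registers[c]
end

literals = ["name"]
registers = [{}, "Terror", nil]
setslot(registers, literals, 0, 0, 1)
loadslot(registers, literals, 2, 0, 0)
p registers
```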
- MAKEARRAY A, B, C: Takes C registers starting from register B, creates an array with them and stores it in register A.
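In Ruby terms, MAKEARRAY is a slice of the register file. A minimal sketch, with `makearray` as a hypothetical helper and a plain array standing in for the registers:

```ruby
# Hypothetical helper: copies C registers starting at register B into a new
# array and stores that array in register A.
def makearray(registers, a, b, c)
  registers[a] = registers[b, c]
end

registers = [nil, 10, 20, 30]
makearray(registers, 0, 1, 3)
p registers[0]
```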
- SEND A, B, C: send a message specified by the string in the literals table at index B to the receiver A, with N arguments depending on the arity of the method, the first argument being in the register C.
- RET A: return from the current call frame to the caller with the value in the register A.
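The way SEND gathers its arguments can be sketched in plain Ruby. This is only an illustration: it borrows Ruby's own method lookup in place of the VM's dispatch, handles fixed-arity methods only, and does not model call frames or RET. `send_message` is a hypothetical helper, and where the VM stores the result is not covered here.

```ruby
# Hypothetical model of SEND's argument gathering. Ruby's method lookup
# stands in for the VM's dispatch.
def send_message(registers, literals, a, b, c)
  receiver = registers[a]
  name = literals[b]
  arity = receiver.method(name).arity
  raise ArgumentError, "variadic methods are not modeled" if arity < 0

  # Take `arity` consecutive registers, starting at register C.
  args = arity > 0 ? registers[c, arity] : []
  receiver.public_send(name, *args)
end

registers = [3, 4]
literals = ["+"]
puts send_message(registers, literals, 0, 0, 1)
```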
- DUMP: Print the contents of all the registers to the standard output.
This was made by Josep M. Bach (Txus) under the MIT license. I'm @txustice on twitter (where you should probably follow me!).