Skip to content

RabbitVM/rabbit

Repository files navigation

Rabbit VM RFC

Why?

I would like to make a RISC architecture that is capable of comfortably sitting on top of nearly any other architecture. If it’s possible to compile to Rabbit, then a program can run on more or less any hardware or virtualized architecture.

Rabbit can be optimized per-architecture, while maintaining the same interface. It may, for example, take advantage of Intel’s SIMD behind the scenes.

What?

Definitions

space: A register or memory location.

Registers

Registers are 32 bits wide.

ValueRegisterUse
0x0zeroContains 0. MIPS style.
0x1 .. 0x9r1 .. r9General purpose.
0xAipInstruction pointer.
0xBspStack pointer.
0xCretReturned value.
0xDtmpTemporary register.
0xEflagsFlags used for comparison.

<<flags_section>>

Flags

BitFlagMeaning
0x0SFSign flag. On if sign bit of result is on.
0x1ZFZero flag. On if result is zero or numbers were the same.
0x2 .. 0x20Reserved.

Instruction set

Real instructions

When it makes sense, the destination register is the first argument to an instruction. The last argument to the following instructions may also be an immediate value, denoted with a prefix of $: move, add, sub, mul, div, shr, shl, nand, xor, br, brz, brnz.

ValueInstructionUsageExplanationDescription
0x0halthaltStop the execution of the machine immediately.
0x1movemove %rB, %rCr[B] := r[C]Move one space into another.
0x2addadd %rA, %rB, %rCr[A] := r[B] + r[C]Add two spaces into a third.
0x3subsub %rA, %rB, %rCr[A] := r[B] - r[C]Subtract two spaces into a third. Sets flags.
0x4mulmul %rA, %rB, %rCr[A] := r[B] * r[C]Multiply two spaces into a third.
0x5divdiv %rA, %rB, %rCr[A] := r[B] / r[C]Divide two spaces into a third.
0x6shrshr %rA, %rB, %rCr[A] := r[B] >> r[C]Shift right one space a number of times.
0x7shlshl %rA, %rB, %rCr[A] := r[B] << r[C]Shift left one space a number of times.
0x8nandnand %rA, %rB, %rCr[A] := not(r[B] & r[C])NAND two spaces.
0x9xorxor %rA, %rB, %rCr[A] := r[B] ^ r[C]XOR two spaces.
0xAbrbr %rCgoto r[C]Branch.
0xBbrzbrz %rCif (ZF set) goto r[C]Branch if ZF is set.
0xCbrnzbrnz %rCif (!(ZF set)) goto r[C]Branch if ZF is not set.
0xDinin %rCr[C] := getchar()Read one character from stdin into a space.
0xEoutout %rCputchar(r[C])Print one character from a space to stdout.

Assembler macros

The last argument to the following macros may also be an immediate value, denoted with a prefix of $: cmp, not, push, call.

MacroUsageExpansion
cmpcmp A, Bsub %tmp, A, B
notnot A, Bnand A, B, B
oror A, B, C(A nand A) nand (B nand B)
andand A, B, Cnand A, B, C // not A, A
pushpush Amove (%sp), A // sub %sp, %sp, $1
poppop Aadd %sp, %sp, $1 // move A, (%sp)
callcall Apush %ip // br A
retretpop %ip

Addressing modes

There are two addressing modes: %reg and (%reg). The former uses the value in the register, and the latter uses the word at the address in the register.

How?

Instruction formats

instr %rA, %rB, %rC
instr %rA, %rB
instr %rA
     +-----Immediate bit
     |+----Addressing mode bit C
     ||+---Addressing mode bit B
     |||+--Addressing mode bit A
     |||| +Dead space+     regB
     vvvv vvvvvvvvvvvv     vvvv
IIII MDDD 000000000000 CCCCBBBBAAAA
^^^^                   ^^^^    ^^^^
Opcode                 regC    regA

VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Immediate value

Every bit in “Dead space” must be turned off. If one is turned on, the result is undefined.

If the immediate bit is on, then the instruction disregards rC and instead looks for its third argument in the 32 bits after the first instruction. For example:

1: 0001 1 000 000000000000 0001 0000 0000
2: 0000 0 000 000000000000 0000 0000 0111

represents a move instruction with the immediate bit set. It will therefore look for an immediate value in the following word (in this case, the value is 7), and then store it in r1.

Addition works in a similar fashion:

1: 0010 1 000 000000000000 0001 0001 0000
2: 0000 0 000 000000000000 0000 0000 0001

represents an add instruction with the immediate bit set. It looks for an immediate value in the following word (in this case, 1), adds it to the value in r1, then stores the result in r1. So this instruction would be an increment instruction.

The addressing mode bits are simple; if a register’s addressing mode bit is on, then the address in the register is dereferenced when the instruction is being executed, and that data is used instead. For example:

1: 0010 0 100 000000000000 0111 0001 0000

Performs an addition operation that adds the contents of zero with r1 and stores the result in memory at the address in r7.

Stages of compilation

Preprocessing

The preprocessor will be responsible for macro expansion and label to address translation. Macros exist in the form of instruction expansions, done behind the scenes.

Peephole optimization

Assembling

Floating point

Floating point computation is left to the client (an exercise for the reader, if you will).

Memory layout

The memory layout is completely flat right now.