Skip to content

aemoncannon/rubinius

 
 

Repository files navigation

Rubinius Trace Compiler Experiment
----------------------------------

This fork of Rubinius hosts my (personal) experiments with adding trace compilation 
to Rubinius. I started working on on this shortly after Rubinius 1.0, as
part of compilers course. 


The approach is based on tracing literature I've read. Largely:

Dynamo: a transparent dynamic optimization system 
http://portal.acm.org/citation.cfm?id=349299.349303

Trace-based just-in-time type specialization for dynamic languages
http://portal.acm.org/citation.cfm?id=1542476.1542528

Nested Trace Trees
http://ssllab.org/hotpath/files/NestedTT.pdf



The interpreter is instrumented to count backward jumps. After a certain threshold the 
interpreter begins recording a trace. A trace is a linked sequence of ops, annotated 
with contextual information. Trace recording can cross method call and block yield 
boundaries, and does not stop until A) the recording returns to the inferred loop header,
or B) the trace gets too long (not yet implemented).

Traces are compiled using a modified version of the existing Rubinius JITing code. The
inline method, inline block emitters were particularly useful.

When control exits a trace unexpectedly, an exit stub is executed which invokes the 
uncommon interpreter. Uncommon is slightly modified so that it will continue executing
until all call_frames higher than the trace's 'home callframe' are popped. Then control
is handed back to the normal interpreter.

Traces can be nested so long as the nestee is well behaved during recording.
The nestee must not exit unexpectedly, and must conclude on its home callframe.


Caveats:

* I'm not currently handling trace exits due to mispredicted dispatch
  (all my benchmarks have fixed dispatch targets).

* This work is branched from a 3-4 month old version of Rubinius (1.0)

* I turned off the GC.

* The traceable subset of ruby is pretty small right now. See benchmarks below for examples
  of what works.



Benchmarks: 
WARNING: These benchmarks are old and unfair. However, they do give a general flavor of where
the trace compilation is advantageous.




# loop.rb
ITERATIONS = 1_000_000

def run()
  i = 0
  while i < ITERATIONS
    i += 1
  end
end

time do
  run()
end


Trace JIT:
0.03585 seconds.

Method JIT:
0.06085 seconds.

No JIT:
0.05253 seconds.

MRI:
0.3680 seconds.


The simplest case and the first thing I implemented. The more iterations are run,
the better trace looks. Method JIT doesn't do well, I assume this is because it 
(at least as of 1.0) didn't do hot stack frame replacement.

-----------------------------------------------------------------






# nest_simple.rb
ITERATIONS = 100_000

def run()
  j = 0
  i = 0

  while i < ITERATIONS
    i += 1
    k = 0
    while k < 1000
      j += 1
      k += 1
    end
  end

end

time do
  run()
end

Trace JIT:
1.3645 seconds.

Method JIT:
7.0563 seconds.

No JIT:
7.5908 seconds.

MRI:
49.3839 seconds.


Simple nested trace example. Note that the outer trace is actually nested into
the inner trace, since inner gets hot first. Same story with Method JIT, not
really getting used.

-------------------------------------------------------------





# nest_funcs.rb
ITERATIONS = 100_000

def foo()
  k = 0
  while k < 1000
    k += 1
  end
  k
end

def run()
  j = 0
  i = 0
  while i < ITERATIONS
    i += 1
    j += foo()
  end
end


time do
  run()
end

Trace JIT:
1.10758 seconds.

Method JIT:
1.01100 seconds.

No JIT:
4.6909 seconds.

MRI:
31.5706260204315 seconds.



Nesting traces across call boundaries. See how the Method JIT kicks in.

-------------------------------------------------------





# nest_funcs_and_exit.rb
ITERATIONS = 100_000

def foo(i)
  if i < ITERATIONS * 0.5
    2
  else
    1
  end
end

def calc(i)
  k = 0
  while k < 100
    k += foo(i)
  end
  k
end


def run()
  j = 0
  i = 0
  while i < ITERATIONS
    i += 1
    j += calc(i)
  end
end


time do
  run()
end

Trace JIT:
4.58751 seconds.

Method JIT:
1.89338 seconds.

No JIT:
2.14444 seconds.

MRI:
16.85354 seconds.


Miss prediction half the time. Tracing performance takes a dive.

----------------------------------------------------------



# if_exit_with_yield.rb
ITERATIONS = 100_000

def foo(i)
  k = 0
  if i < ITERATIONS/2
    k = 2
  else
    k = -3
  end
  k
end

def calc(i)
  k = 0
  1.times do
    k += foo(i)
  end
  k
end

def run()
  j = 0
  i = 0
  while i < ITERATIONS
    j += calc(i)
    i += 1
  end
  puts "Result: #{j}"
end


time do
  run()
end

Trace JIT:
0.1841919 seconds.

Method JIT:
0.2230579 seconds.

No JIT:
0.112933 seconds.

MRI:
0.37597 seconds.

Not totally sure why No JIT wins here. Might be a compile time things.

---------------------------------------------------------------





# bm_fact.rb
ITERATIONS = 1_000

def fact(n)
  if (n > 1)
    n * fact(n-1)
  else
    1
  end
end

time do
  fact(ITERATIONS)
end

Trace JIT:
0.0028970 seconds.

Method JIT:
0.0021700 seconds.

No JIT:
0.0019800 seconds.

MRI:
0.005550 seconds.


No trace happens, since we never really loop. Need to add special
handling for this - but there may be a neat nested trace solution.

-------------------------------------------




# bm_nbody.rb
# ITERATIONS = 1_000
# See bm_nbody.rb in ./bench

Trace JIT:
0.215943 seconds.

Method JIT:
0.107136 seconds.

No JIT:
0.08390 seconds.

MRI:
0.104515 seconds.


Suspect there's lots of exits here. Haven't really dug into it.
But it runs! :)















ORIGINAL README
------------------

1. What is Rubinius

Rubinius is an execution environment for the Ruby programming language.  It is
comprised of three major pieces: a compiler, a 'kernel' (otherwise known as
the Ruby Core Library), and a virtual machine. The project's goal is to create
a top-of-the-line Ruby implementation.

2. License

Rubinius uses the BSD license. See LICENSE for details.

3. Running Rubinius

See doc/getting_started.txt.

3.1. For the impatient

Now to configure with LLVM: "./configure --enable-llvm"

This will try to download a prebuilt version of llvm for your system. If it
can't find a prebuilt version, then it will at the very least checkout LLVM
from svn and built it during the next step (this takes a lot of time).

  or

To configure without the JIT: "./configure"

Now: "rake"

4. Status

Rubinius is under heavy development and currently supports the core Ruby
classes and kernel methods. The majority of the existing Ruby libraries should
run without modification.  If your MRI 1.8.6-compatible code does not run
under Rubinius, please open a bug ticket. See doc/howto/write_a_ticket.txt.

As Rubinius becomes more and more compatible with Ruby 1.8, the development
effort is shifting toward performance, rather than completeness.

5. Goals

* Thread safety. Rubinius intends to be thread-safe so you could embed more
  than one interpreter in a single application.

* Clean, readable code that is easy for users to understand and extend.

* Reliable, rock-solid code. Valgrind is used to help verify correctness.

* Bring modern techniques to the Ruby runtime. Pluggable garbage collectors and
  code optimizers are possible examples.

6. Tickets

See doc/howto/write_a_ticket.txt

7. Contributing

The Rubinius team welcomes contributions, bug reports, test cases, and
monetary support. One possible way to help is implement Ruby library classes.
See doc/contributing.txt to get started.

8. Architecture

While most of the Rubinius features are implemented in Ruby, the VM itself is
written in C++. This is likely to continue to be the case in the coming
months, partly to ease the integration of LLVM into the Rubinius system.

The compiler, assembler, and bytecode generators are all written in Ruby, and
can be found under the ./kernel/compiler directory.

Packages

No packages published

Languages

  • Ruby 57.8%
  • C 23.1%
  • C++ 10.8%
  • Shell 4.0%
  • JavaScript 3.8%
  • Perl 0.3%
  • Other 0.2%