Ruby Under a Microscope

How JRuby executes your code

As I explained in Chapter 1, JRuby tokenizes and parses your Ruby code in almost the same way that MRI Ruby does. And, like Ruby 1.9 and Ruby 2.0, JRuby then compiles your Ruby code into byte code instructions before running your program on a virtual machine.

However, this is where the similarity ends: MRI and JRuby use two very different virtual machines to execute your code. As I showed earlier in Chapter 2, MRI Ruby 1.9 and higher use YARV, which was custom designed to run Ruby programs. JRuby, however, uses the Java Virtual Machine (JVM) to execute your Ruby program. Despite its name, many different programming languages run on the JVM. In fact, this really is JRuby’s raison d'être: the whole point of building a Ruby interpreter with Java is to be able to execute Ruby programs using the JVM. There are two important reasons to do this:

Environmental: Using the JVM opens new doors for Ruby and allows you to use Ruby on servers, in applications and in IT organizations where previously you could not run Ruby at all.
Technical: The JVM is the product of almost 20 years of intense research and development. It contains sophisticated solutions for many difficult computer science problems such as garbage collection, multithreading, and much more. By running on the JVM, Ruby can take advantage of all of this work, which often means running faster and more reliably.

To get a better sense of how this works, let’s take a look at how JRuby would execute the same one-line Ruby script I used as an example earlier:

puts 2+2

The first thing JRuby does is tokenize and parse this Ruby code into an AST node structure. Once this is finished, JRuby will iterate through the AST nodes and convert your Ruby into Java byte code. Using the --bytecode command line option you can actually see this byte code for yourself:

$ cat simple.rb
puts 2+2
$ jruby --bytecode simple.rb

The output is complex and confusing, and I don’t have the space to explain it here, but here’s a diagram summarizing how JRuby compiles and executes this one-line program:

Here’s how this works:

On the top left I show the “puts 2+2” Ruby source code from simple.rb.
The downward arrow indicates that JRuby translates this into a Java class, named “simple” after my Ruby file name, and derived from the AbstractScript base class.
The JVM later calls the second method in this class, __file__, in order to execute my compiled Ruby script. The __file__ method contains the compiled version of the top level Ruby code in simple.rb, which in this example is the entire program.
The __file__ method, in turn, calls the op_plus method in the RubyFixnum Java class.
Once JRuby’s RubyFixnum Java class has added 2+2 for me and returned 4, __file__ will call the puts method in the RubyIO Java class to display the result.
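The call sequence described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code JRuby actually generates: the class names SimpleSketch, FixnumSketch and IoSketch are hypothetical stand-ins for the compiled script class, RubyFixnum and RubyIO, and the real methods take additional arguments such as a ThreadContext.

```java
// Hypothetical stand-in for org.jruby.RubyFixnum: wraps a Java long.
final class FixnumSketch {
    final long value;

    FixnumSketch(long value) {
        this.value = value;
    }

    // Plays the role of op_plus: the receiver adds the operand to itself.
    // Overflow handling is left out of this sketch.
    FixnumSketch opPlus(FixnumSketch other) {
        return new FixnumSketch(value + other.value);
    }

    @Override
    public String toString() {
        return Long.toString(value);
    }
}

// Hypothetical stand-in for org.jruby.RubyIO.
final class IoSketch {
    // Plays the role of puts: writes the object followed by a newline.
    // It returns the text written only so the sketch is easy to check;
    // Ruby's puts returns nil.
    static String puts(Object obj) {
        String line = obj + "\n";
        System.out.print(line);
        return line;
    }
}

// Hypothetical stand-in for the class JRuby generates from simple.rb.
final class SimpleSketch {
    // Plays the role of __file__: the compiled top level code of the script.
    static String file() {
        FixnumSketch sum = new FixnumSketch(2).opPlus(new FixnumSketch(2));
        return IoSketch.puts(sum);
    }

    public static void main(String[] args) {
        file();
    }
}
```

Running SimpleSketch follows the same path as the steps above: the script method asks one integer object to add another, then hands the result to an output routine.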

There are a couple of important ideas to notice in all of this: First, as I said above, your Ruby code is compiled into Java byte code. It’s both alarming and amazing at the same time to imagine one of my Ruby programs converted into Java! However, remember we’re talking about Java byte code here, not an actual Java program. Java byte code instructions are very low level in nature and can be used to represent code originally written in any language, not just Java.

Second, JRuby implements all of the built-in Ruby classes such as Fixnum and IO using Java classes; these classes are named RubyFixnum, RubyIO, etc. Of course, JRuby also implements all of the Ruby language’s intrinsic behavior as a series of other Java classes, including objects, modules, blocks, and lambdas. I’ll touch on a few of these implementations in the following chapters.

Internally, the JVM uses a stack to save arguments, return values and local variables just like YARV does. However, explaining how the JVM works is beyond the scope of this book.

To get a feel for what the JRuby source code looks like, let’s take a quick look at the op_plus method in the org.jruby.RubyFixnum Java class:

public IRubyObject op_plus(ThreadContext context,
                           IRubyObject other) {
  if (other instanceof RubyFixnum) {
    return addFixnum(context, (RubyFixnum)other);
  }
  return addOther(context, other);
}

First of all, remember this is a method of the RubyFixnum Java class, which represents the Ruby Fixnum class; the receiver of the op_plus operation is an instance of it. Thinking about this for a moment, this means that each instance of a Ruby object, such as the Fixnum receiver “2” in my example, is represented by an instance of a Java class. This is one of the key concepts behind how JRuby’s implementation works: for every Ruby object instance there is an underlying Java object instance. I’ll have more to say about this in Chapter 3.

Next, note the arguments to op_plus are something called a ThreadContext and the operand of the addition operation, a Java object called other which implements the IRubyObject interface. Reading the code above, we can see that if the other operand is also an instance of RubyFixnum then JRuby will call the addFixnum method; here is that code:

private IRubyObject addFixnum(ThreadContext context,
                              RubyFixnum other) {
  long otherValue = other.value;
  long result = value + otherValue;
  if (additionOverflowed(value, otherValue, result)) {
    return addAsBignum(context, other);
  }
  return newFixnum(context.getRuntime(), result);
}

Here you can see the Java code calculates the actual result of the “2+2” operation: “result = value + otherValue”. If the result were too large to fit into a Fixnum object, JRuby would call the addAsBignum method instead. Finally, JRuby creates a new Fixnum instance, sets its value to result (4 in this example), and returns it.
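The overflow check deserves a closer look. Here is a minimal, self-contained sketch of the same idea, not JRuby’s actual code: the class name AddSketch is hypothetical, the bit test shown is one common way to detect signed overflow (JRuby’s own additionOverflowed may be written differently), and java.math.BigInteger stands in for Ruby’s Bignum.

```java
import java.math.BigInteger;

// Hypothetical sketch of the add-then-check-overflow pattern.
final class AddSketch {
    // True when a + b wrapped around the 64-bit signed range.
    // That happens exactly when both operands have the same sign
    // and the sign of the result differs from it.
    static boolean additionOverflowed(long a, long b, long result) {
        return ((a ^ result) & (b ^ result)) < 0;
    }

    // Returns a Long when the sum fits in 64 bits, otherwise a BigInteger
    // (standing in for the addAsBignum path).
    static Number add(long a, long b) {
        long result = a + b;
        if (additionOverflowed(a, b, result)) {
            return BigInteger.valueOf(a).add(BigInteger.valueOf(b));
        }
        return Long.valueOf(result);
    }

    public static void main(String[] args) {
        System.out.println(add(2, 2));              // prints 4
        System.out.println(add(Long.MAX_VALUE, 1)); // prints 9223372036854775808
    }
}
```

The design choice is the one visible in addFixnum: do the cheap primitive addition first, and only fall back to the slower arbitrary-precision path when the check says the result did not fit.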