Metaprogramming Techniques

Now that we've covered the fundamentals of Ruby, we can examine some of the common metaprogramming techniques that are used in Rails.

Although we write examples in Ruby, most of these techniques are applicable to any dynamic programming language. In fact, many of Ruby's metaprogramming idioms are shamelessly stolen from Lisp, Smalltalk, or Perl.

Delaying Method Lookup Until Runtime

Often we want to create an interface whose methods vary depending on some piece of runtime data. The most prominent example of this in Rails is ActiveRecord's attribute accessor methods. Method calls on an ActiveRecord object (like person.name) are translated at runtime to attribute accesses. At the class-method level, ActiveRecord offers extreme flexibility: Person.find_all_by_user_id_and_active(42, true) is translated into the appropriate SQL query, raising the standard NoMethodError exception should those attributes not exist.

The magic behind this is Ruby's method_missing method. When a nonexistent method is called on an object, Ruby calls that object's method_missing method instead; the default implementation raises NoMethodError, but a class can override it. method_missing's first argument is the name of the method called (as a symbol); the remaining arguments are those passed to the original call. Any block passed to the method is passed through to method_missing. So, a complete method signature is:

	def method_missing(method_id, *args, &block)
	  ...
	end
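
To make the signature concrete, here is a minimal sketch of a dynamic finder in the spirit of find_all_by_user_id_and_active. The Roster class, its in-memory rows, and the finder_attributes helper are hypothetical stand-ins, not ActiveRecord's implementation; a real finder would build a SQL query rather than filter an array.

```ruby
# Hypothetical in-memory stand-in for a model class; not ActiveRecord code.
class Roster
  def initialize(rows)
    @rows = rows # array of hashes keyed by attribute name (symbols)
  end

  # Intercepts calls such as find_all_by_user_id_and_active(42, true).
  def method_missing(method_id, *args, &block)
    attributes = finder_attributes(method_id)
    return super unless attributes # default behavior: raise NoMethodError

    @rows.select do |row|
      attributes.zip(args).all? { |attribute, value| row[attribute] == value }
    end
  end

  # Keeps respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(method_id, include_private = false)
    !finder_attributes(method_id).nil? || super
  end

  private

  # Returns the attribute names encoded in a finder name, or nil if the
  # name is not a finder or mentions an attribute that no row has.
  def finder_attributes(method_id)
    match = /\Afind_all_by_(\w+)\z/.match(method_id.to_s)
    return nil unless match

    attributes = match[1].split('_and_').map(&:to_sym)
    known = @rows.flat_map(&:keys).uniq
    attributes.all? { |attribute| known.include?(attribute) } ? attributes : nil
  end
end

people = Roster.new([
  { user_id: 42, active: true,  name: 'Brad' },
  { user_id: 42, active: false, name: 'John' }
])

people.find_all_by_user_id_and_active(42, true) # returns only Brad's row
people.respond_to?(:find_all_by_user_id)        # => true
# people.find_all_by_shoe_size(9) would raise NoMethodError
```

Calling super for names we do not recognize preserves the standard NoMethodError, and defining respond_to_missing? keeps respond_to? honest about the methods we fake.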

There are several drawbacks to using method_missing:

It is slower than conventional method lookup. Simple tests indicate that dispatch through method_missing takes at least two to three times as long as conventional dispatch.

Since the methods being called never actually exist—they are just intercepted at the last step of the method lookup process—they cannot be documented or introspected as conventional methods can.

Because all dynamic methods must go through the method_missing method, the body of that method can become quite large if there are many different aspects of the code that need to add methods dynamically.

Using method_missing restricts compatibility with future versions of an API. Once you rely on method_missing to do something interesting with undefined methods, introducing new methods in a future API version can break your users' expectations.

A good alternative is the approach taken by ActiveRecord's generate_read_methods feature. Rather than waiting for method_missing to intercept the calls, ActiveRecord generates an implementation for the attribute setter and reader methods so that they can be called via conventional method dispatch.

This is a powerful technique in general, and the dynamic nature of Ruby makes it possible to write methods that replace themselves with optimized versions of themselves when they are first called. This is used in Rails routing, which needs to be very fast; we will see that in action later in this chapter.
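
As a rough illustration of a method that replaces itself, consider the sketch below. TinyRouter and its lookup table are hypothetical and far simpler than Rails routing; the only point is that the first call pays for the setup and then swaps in a faster definition.

```ruby
# Hypothetical route table; Rails routing is far more elaborate than this.
class TinyRouter
  attr_reader :compile_count

  def initialize(routes)
    @routes = routes
    @compile_count = 0
  end

  # Slow path: runs only on the first call for each router object.
  def recognize(path)
    @compile_count += 1
    table = @routes.dup.freeze # stands in for an expensive compilation step

    # Replace this method, for this object only, with a fast version that
    # closes over the precomputed table.
    define_singleton_method(:recognize) { |fast_path| table[fast_path] }

    recognize(path) # now dispatches to the replacement
  end
end

router = TinyRouter.new({ '/' => :home, '/about' => :about })
router.recognize('/')      # => :home (compiles, then looks up)
router.recognize('/about') # => :about (fast path only)
router.compile_count       # => 1
```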

Generative Programming: Writing Code On-the-Fly

One powerful technique that encompasses some of the others is generative programming—code that writes code.

This technique can manifest in the simplest ways, such as writing a shell script to automate some tedious part of programming. For example, you may want to populate your test fixtures with a sample project for each user:

	brad_project:
	  id: 1
	  owner_id: 1
	  billing_status_id: 12

	john_project:
	  id: 2
	  owner_id: 2
	  billing_status_id: 4

	...

If this were a language without scriptable test fixtures, you might be writing these by hand. This gets messy when the data starts growing, and is next to impossible when the fixtures have strange dependencies on the source data. Naïve generative programming would have you writing a script to generate this fixture from the source. Although not ideal, this is a great improvement over writing the complete fixtures by hand. But this is a maintenance headache: you have to incorporate the script into your build process, and ensure that the fixture is regenerated when the source data changes.

This is rarely, if ever, needed in Ruby or Rails (thankfully). Almost every aspect of Rails application configuration is scriptable, due in large part to the use of internal domain-specific languages (DSLs). In an internal DSL, you have the full power of the Ruby language at your disposal, not just the particular interface the library author decided you should have.

Returning to the preceding example, ERb makes our job a lot easier. We can inject arbitrary Ruby code into the YAML file above using ERb's <% %> and <%= %> tags, including whatever logic we need:

	<% User.find_all_by_active(true).each_with_index do |user, i| %>
	<%= user.login %>_project:
	  id: <%= i + 1 %>
	  owner_id: <%= user.id %>
	  billing_status_id: <%= user.billing_status.id %>

	<% end %>

ActiveRecord's implementation of this handy trick couldn't be simpler:

	yaml = YAML::load(erb_render(yaml_string))

using the helper method erb_render:

	def erb_render(fixture_content)
	  ERB.new(fixture_content).result
	end
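
The same two steps can be tried outside of Rails. The sketch below is self-contained under some assumptions: the User struct, the USERS list, and FIXTURE_TEMPLATE are made up for illustration and replace the database query used in the fixture above.

```ruby
require 'erb'
require 'yaml'

# Hypothetical stand-ins for the User model and its records.
User = Struct.new(:id, :login, :billing_status_id)
USERS = [User.new(1, 'brad', 12), User.new(2, 'john', 4)].freeze

# A YAML fixture with embedded Ruby, held in a string for the example.
FIXTURE_TEMPLATE = <<~'TEMPLATE'
  <% USERS.each_with_index do |user, i| %>
  <%= user.login %>_project:
    id: <%= i + 1 %>
    owner_id: <%= user.id %>
    billing_status_id: <%= user.billing_status_id %>
  <% end %>
TEMPLATE

# Step 1: run the embedded Ruby, producing plain YAML text.
def erb_render(fixture_content)
  ERB.new(fixture_content).result
end

# Step 2: parse the generated YAML as usual.
fixtures = YAML.load(erb_render(FIXTURE_TEMPLATE))
fixtures['john_project']['billing_status_id'] # => 4
```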

Generative programming often uses either Module#define_method or class_eval and def to create methods on-the-fly. ActiveRecord uses this technique for attribute accessors; the generate_read_methods feature defines the setter and reader methods as instance methods on the ActiveRecord class in order to reduce the number of times method_missing (a relatively expensive technique) is needed.
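
A minimal sketch of that idea follows. The Record class and its generate_read_method helper are hypothetical and only loosely modeled on what ActiveRecord does: the first call to an attribute reader goes through method_missing, which defines a real method so that later calls use conventional dispatch.

```ruby
# Hypothetical model-like class; not ActiveRecord's actual implementation.
class Record
  def initialize(attributes)
    @attributes = attributes # hash of attribute name (String) => value
  end

  # Defines a real reader on the class. The same thing could be written
  # with class_eval and def instead of define_method.
  def self.generate_read_method(name)
    define_method(name) { @attributes[name] }
  end

  def method_missing(method_id, *args, &block)
    name = method_id.to_s
    return super unless @attributes.key?(name)

    self.class.generate_read_method(name)
    __send__(method_id) # dispatches to the freshly defined method
  end

  def respond_to_missing?(method_id, include_private = false)
    @attributes.key?(method_id.to_s) || super
  end
end

person = Record.new({ 'login' => 'brad' })
Record.method_defined?(:login) # => false
person.login                   # => "brad" (via method_missing)
Record.method_defined?(:login) # => true
person.login                   # => "brad" (conventional dispatch)
```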

Continuations

Continuations are a very powerful control-flow mechanism. A continuation represents a particular state of the call stack and lexical variables. It is a snapshot of a point in time when evaluating Ruby code. Unfortunately, the Ruby 1.8 implementation of continuations is so slow as to be unusable for many applications. The upcoming Ruby 1.9 virtual machines may improve this situation, but you should not expect good performance from continuations under Ruby 1.8. However, they are useful constructs, and continuation-based web frameworks provide an interesting alternative to frameworks like Rails, so we will survey their use here.

Continuations are powerful for several reasons:

Continuations are just objects; they can be passed around from function to function.
Continuations can be invoked from anywhere. If you hold a reference to a continuation, you can invoke it.
Continuations are re-entrant. You can use continuations to return from a function multiple times.
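
A small sketch shows all three properties at once. It assumes an MRI interpreter in which the continuation library is still available (recent versions print an obsolescence warning when it is used); visit_until is a hypothetical helper written for this example.

```ruby
require 'continuation' # provides Kernel#callcc on MRI

def visit_until(limit)
  visits = []
  # callcc yields the current continuation to the block; returning it from
  # the block stores the continuation object itself in k.
  k = callcc { |continuation| continuation }
  visits << visits.size + 1
  # Invoking k jumps back to the point where callcc returned, this time
  # with the argument (k itself) as callcc's return value.
  k.call(k) if visits.size < limit
  visits
end

visit_until(3) # => [1, 2, 3]
```

The assignment to k is executed three times even though callcc appears only once in the source: the continuation is an ordinary object held in a variable, it is invoked from a later line, and each invocation re-enters the code after callcc.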

Continuations are often described as "structured GOTO." As such, they should be treated with the same caution as any kind of GOTO construct. Continuations have little or no place inside application code; they should usually be encapsulated within libraries. I don't say this because I think developers should be protected from themselves. Rather, continuations are general enough that it makes more sense to build abstractions around them than to use them directly. The idea is that a programmer should think "external iterator" or "coroutine" (both abstractions built on top of continuations) rather than "continuation" when building the application software.

Seaside ^[10] is a Smalltalk web application framework built on top of continuations. Continuations are used in Seaside to manage session state. Each user session corresponds to a server-side continuation. When a request comes in, the continuation is invoked and more code is run. The upshot is that entire transactions can be written as a single stream of code, even if they span multiple HTTP requests. This power comes from the fact that Smalltalk's continuations are serializable; they can be written out to a database or to the filesystem, then thawed and reinvoked upon a request. Ruby's continuations are nonserializable. In Ruby, continuations are in-memory only and cannot be transformed into a byte stream.

Borges (http://borges.rubyforge.org/) is a straightforward port of Seaside 2 to Ruby. The major difference between Seaside and Borges is that Borges must store all current continuations in memory, as they are not serializable. This is a huge limitation that unfortunately prevents Borges from being successful for web applications with any kind of volume. If serializable continuations are implemented in one of the Ruby implementations, this limitation can be removed.

The power of continuations is evident in the following Borges sample code, which renders a list of items from an online store:

	class SushiNet::StoreItemList < Borges::Component

	  def choose(item)
	    call SushiNet::StoreItemView.new(item)
	  end

	  def initialize(items)
	    @batcher = Borges::BatchedList.new items, 8
	  end 

	  def render_content_on(r)
	    r.list_do @batcher.batch do |item|
	      r.anchor item.title do choose item end
	    end

	    r.render @batcher
	  end

	end # class SushiNet::StoreItemList

The bulk of the action happens in the render_content_on method, which uses a BatchedList (a paginator) to render a paginated list of links to products. But the fun happens in the call to anchor, which stores away the call to choose, to be executed when the corresponding link is clicked.

However, there is still vast disagreement on how useful continuations are for web programming. HTTP was designed as a stateless protocol, and continuations for web transactions are the polar opposite of statelessness. All of the continuations must be stored on the server, which takes additional memory and disk space. Sticky sessions are required, to direct a user's traffic to the same server. As a result, if one server goes down, all of its sessions are lost. The most popular Seaside application, DabbleDB (http://dabbledb.com/), actually uses continuations very little.

Bindings

Bindings provide context for evaluation of Ruby code. A binding is the set of variables and methods that are available at a particular (lexical) point in the code. Any place in Ruby code where statements may be evaluated has a binding, and that binding can be obtained with Kernel#binding. Bindings are just objects of class Binding, and they can be passed around as any objects can:

	class C
	  binding # => #<Binding:0x2533c>
	  def a_method
	    binding
	  end
	end
	binding # => #<Binding:0x252b0>
	C.new.a_method # => #<Binding:0x25238>

The Rails scaffold generator provides a good example of the use of bindings:

	class ScaffoldingSandbox
	  include ActionView::Helpers::ActiveRecordHelper
	  attr_accessor :form_action, :singular_name, :suffix, :model_instance

	  def sandbox_binding
	    binding
	  end

	  # ...
	end

ScaffoldingSandbox is a class that provides a clean environment from which to render a template. ERb can render templates within the context of a binding, so that an API is available from within the ERb templates.

	part_binding = template_options[:sandbox].call.sandbox_binding
	# ...
	ERB.new(File.readlines(part_path).join,nil,'-').result(part_binding)
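
The same pattern works outside the generator. In the sketch below, TemplateSandbox, its label_for helper, FORM_TEMPLATE, and render_in_sandbox are hypothetical names chosen for illustration; the keyword form of ERB.new assumes Ruby 2.6 or later.

```ruby
require 'erb'

# A clean object whose methods form the API visible to the template.
class TemplateSandbox
  attr_accessor :singular_name, :fields

  # Exposes this object's context (self, methods, instance variables).
  def sandbox_binding
    binding
  end

  # A helper the template can call without any receiver.
  def label_for(field)
    field.to_s.capitalize
  end
end

FORM_TEMPLATE = <<~'TEMPLATE'
  <h1>New <%= singular_name %></h1>
  <% fields.each do |field| -%>
  <label><%= label_for(field) %></label>
  <% end -%>
TEMPLATE

# Evaluates the template's Ruby code against the sandbox's binding.
def render_in_sandbox(template, sandbox)
  ERB.new(template, trim_mode: '-').result(sandbox.sandbox_binding)
end

sandbox = TemplateSandbox.new
sandbox.singular_name = 'person'
sandbox.fields = [:name, :email]
puts render_in_sandbox(FORM_TEMPLATE, sandbox)
```

Because the template is evaluated against the sandbox's binding, bare names such as singular_name and label_for resolve to methods on the sandbox object, and nothing else from the calling code leaks in.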

Earlier I mentioned that blocks are closures. A closure's binding represents its state—the set of variables and methods it has access to. We can get at a closure's binding with the Proc#binding method:

	def var_from_binding(&b)
	  eval("var", b.binding)
	end
	
	var = 123
	var_from_binding {} # => 123
	var = 456
	var_from_binding {} # => 456

Here we are only using the Proc as a method by which to get the binding. By accessing the binding (context) of those blocks, we can access the local variable var with a simple eval against the binding.

Introspection and ObjectSpace: Examining Data and Methods at Runtime

Ruby provides many methods for looking into objects at runtime. There are object methods to access instance variables. These methods break encapsulation, so use them with care.

	class C
	  def initialize
	    @ivar = 1
	  end
	end

	c = C.new
	c.instance_variables              # => ["@ivar"]
	c.instance_variable_get(:@ivar)   # => 1
	
	c.instance_variable_set(:@ivar, 3) # => 3
	c.instance_variable_get(:@ivar)    # => 3

The Object#methods method returns an array of instance methods, including singleton methods, defined on the receiver. If the first parameter to methods is false, only the object's singleton methods are returned.

	class C
	  def inst_method
	  end

	  def self.cls_method
	  end
	end

	c = C.new

	class << c
	  def singleton_method
	  end
	end

	c.methods - Object.methods # => ["inst_method", "singleton_method"]
	c.methods(false) # => ["singleton_method"]

Module#instance_methods returns an array of the class or module's instance methods. Note that instance_methods is called on the class, while methods is called on an instance. Passing false to instance_methods skips the superclasses' methods:

	C.instance_methods(false) # => ["inst_method"]

We can also use Metaid's metaclass method to examine C's class methods:

	C.metaclass.instance_methods(false) # => ["new", "allocate", "cls_method",
	                                    #     "superclass"]

In my experience, most of the value from these methods is in satisfying curiosity. With the exception of a few well-established idioms, there is rarely a need in production code to reflect on an object's methods. Far more often, these techniques can be used at a console prompt to find methods available on an object—it's usually quicker than reaching for a reference book:

	Array.instance_methods.grep /sort/ # => ["sort!", "sort", "sort_by"]

ObjectSpace

ObjectSpace is a module used to interact with Ruby's object system. It has a few useful module methods that can make low-level hacking easier:

Garbage-collection methods: define_finalizer (sets up a callback to be called just before an object is destroyed), undefine_finalizer (removes those callbacks), and garbage_collect (starts garbage collection).
_id2ref converts an object's ID to a reference to that Ruby object.
each_object iterates through all objects (or all objects of a certain class) and yields them to a block.

As always, with great power comes great responsibility. Although these methods can be useful, they can also be dangerous. Use them judiciously.

An example of the proper use of ObjectSpace is found in Ruby's Test::Unit framework. This code uses ObjectSpace.each_object to enumerate all classes in existence that inherit from Test::Unit::TestCase:

	test_classes = []
	ObjectSpace.each_object(Class) {
	  | klass |
	  test_classes << klass if (Test::Unit::TestCase > klass)
	}

ObjectSpace, unfortunately, greatly complicates some Ruby virtual machines. In particular, JRuby performance suffers tremendously when ObjectSpace is enabled, because the Ruby interpreter cannot directly examine the JVM's heap for extant objects. Instead, JRuby must keep track of objects manually, which adds a great amount of overhead. As the same tricks can be achieved with methods like Module.extended and Class.inherited, there are not many cases where ObjectSpace is genuinely necessary.
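
For instance, a base class can keep its own registry of subclasses with the inherited hook instead of scanning ObjectSpace. This is a sketch under assumed names: BaseTestCase and test_classes are hypothetical, and this is not how Test::Unit itself is written.

```ruby
# Hypothetical base class standing in for Test::Unit::TestCase.
class BaseTestCase
  def self.test_classes
    @test_classes ||= []
  end

  # Ruby calls this hook whenever a class is created that inherits from
  # BaseTestCase, directly or through an intermediate subclass.
  def self.inherited(subclass)
    super
    BaseTestCase.test_classes << subclass
  end
end

class UserTest < BaseTestCase; end
class AdminUserTest < UserTest; end

BaseTestCase.test_classes # => [UserTest, AdminUserTest]
```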

Delegation with Proxy Classes

Delegation is a form of composition. It is similar to inheritance, except with more conceptual "space" between the objects being composed. Delegation implies a "has-a" rather than an "is-a" relationship. When one object delegates to another, there are two objects in existence, rather than the one object that would result from an inheritance hierarchy.

Delegation is used in ActiveRecord's associations. The AssociationProxy class delegates most methods (including class) to its target. In this way, associations can be lazily loaded (not loaded until their data is needed) with a completely transparent interface.
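
A stripped-down sketch of the lazy-loading idea follows. It assumes nothing about AssociationProxy's real internals: LazyProxy is a hypothetical class that defers an expensive load until the first delegated call.

```ruby
# Hypothetical proxy; inheriting from BasicObject leaves almost no methods
# of its own, so nearly every call (even #class) reaches method_missing.
class LazyProxy < BasicObject
  def initialize(&loader)
    @loader = loader
    @loaded = false
  end

  def loaded?
    @loaded
  end

  # Forwards everything else to the target, loading it on first use.
  def method_missing(name, *args, &block)
    __target__.__send__(name, *args, &block)
  end

  private

  def __target__
    unless @loaded
      @target = @loader.call
      @loaded = true
    end
    @target
  end
end

projects = LazyProxy.new { [3, 1, 2] } # the block stands in for a query
projects.loaded? # => false
projects.sort    # => [1, 2, 3] (triggers the load)
projects.loaded? # => true
projects.class   # => Array
```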

DelegateClass and Forwardable

Ruby's standard library includes facilities for delegation. The simplest is DelegateClass. By inheriting from DelegateClass(klass) and calling super(instance) in the constructor, a class delegates any unknown method calls to the provided instance of the class klass. As an example, consider a Settings class that delegates to a hash:

	require 'delegate'
	class Settings < DelegateClass(Hash)
	  def initialize(options = {})
	    super({:initialized_at => Time.now - 5}.merge(options))
	  end

	  def age
	    Time.now - self[:initialized_at]
	  end
	end

	settings = Settings.new :use_foo_bar => true
	
	# Method calls are delegated to the object
	settings[:use_foo_bar] # => true
	settings.age # => 5.000301

The Settings constructor calls super to set the delegated object to a new hash. Note the difference between composition and inheritance: if we had inherited from Hash, then Settings would be a hash; in this case, Settings has a hash and delegates to it. This composition relationship offers increased flexibility, especially when the object to be delegated to may change (a function provided by SimpleDelegator).
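
As a brief sketch of that last point, SimpleDelegator (also in the delegate library) lets the wrapped object be swapped after construction with __setobj__. The two hashes here are made-up sample data.

```ruby
require 'delegate'

defaults = { host: 'db1.example.com', port: 5432 }
failover = { host: 'db2.example.com', port: 5432 }

settings = SimpleDelegator.new(defaults)
settings[:host]               # => "db1.example.com"

settings.__setobj__(failover) # swap the delegate; same wrapper object
settings[:host]               # => "db2.example.com"
settings.__getobj__.equal?(failover) # => true
```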

The Ruby standard library also includes Forwardable, which provides a simple interface by which individual methods, rather than all undefined methods, can be delegated to another object. ActiveSupport in Rails provides similar functionality with a cleaner API through Module#delegate:

	class User < ActiveRecord::Base
	  belongs_to :person

	  delegate :first_name, :last_name, :phone, :to => :person
	end
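
For comparison, here is roughly the same delegation written with the standard library's Forwardable, which needs no Rails. The Contact struct and the contact_number alias are hypothetical, and the person is held in an instance variable rather than an association.

```ruby
require 'forwardable'

# Hypothetical value object standing in for the Person model.
Contact = Struct.new(:first_name, :last_name, :phone)

class User
  extend Forwardable

  # Delegate several methods to @person under their own names...
  def_delegators :@person, :first_name, :last_name, :phone
  # ...or a single method under a different name.
  def_delegator :@person, :phone, :contact_number

  def initialize(person)
    @person = person
  end
end

user = User.new(Contact.new('Brad', 'Smith', '555-0100'))
user.first_name     # => "Brad"
user.contact_number # => "555-0100"
```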

Monkeypatching

In Ruby, all classes are open. Any object or class is fair game to be modified at any time. This gives many opportunities for extending or overriding existing functionality. This extension can be done very cleanly, without modifying the original definitions.

Rails takes advantage of Ruby's open class system extensively. Opening classes and adding code is referred to as monkeypatching (a term from the Python community). Though it sounds derogatory, this term is used in a decidedly positive light; monkeypatching is, on the whole, seen as an incredibly useful technique. Almost all Rails plugins monkeypatch the Rails core in some way or another.

Disadvantages of monkeypatching

There are two primary disadvantages to monkeypatching. First, the code for one method call may be spread over several files. The foremost example of this is in ActionController's process method. This method is intercepted by methods in up to five different files during the course of a request. Each of these methods adds another feature: filters, exception rescue, components, and session management. The end result is a net gain: the benefit gained by separating each functional component into a separate file outweighs the inflated call stack.

Another consequence of the functionality being spread around is that it can be difficult to properly document a method. Because the function of the process method can change depending on which code has been loaded, there is no good place to document what each of the methods is adding. This problem exists because the actual identity of the process method changes as the methods are chained together.

Adding Functionality to Existing Methods

Because Rails encourages the philosophy of separation of concerns, you often will have the need to extend the functionality of existing code. Many times you will want to "patch" a feature onto an existing function without disturbing that function's code. Your addition may not be directly related to the function's original purpose: it may add authentication, logging, or some other important cross-cutting concern.

We will examine several approaches to the problem of cross-cutting concerns, and explain the one (method chaining) that has acquired the most momentum in the Ruby and Rails communities.

Subclassing

In traditional object-oriented programming, a class can be extended by inheriting from it and changing its data or behavior. This paradigm works for many purposes, but it has drawbacks:

The changes you want to make may be small, in which case setting up a new class may be overly complex. Each new class in an inheritance hierarchy adds to the mental overhead required to understand the code.
You may need to make a series of related changes to several otherwise-unrelated classes. Subclassing each one individually would be overkill and would separate functionality that should be kept together.
The class may already be in use throughout an application, and you want to change its behavior globally.
You may want to add or remove a feature at runtime, and have it take effect globally. (We will explore this technique with a full example later in the chapter.)

In more traditional object-oriented languages, these features would require complex code. Not only would the code be complex, it would be tightly coupled to either the existing code or the code that calls it.

Aspect-oriented programming

Aspect-oriented programming (AOP) is one technique that attempts to solve the issues of cross-cutting concerns. There has been much talk about the applicability of AOP to Ruby, since many of the advantages that AOP provides can already be obtained through metaprogramming. There is a Ruby proposal for cut-based AOP, ^[11] but it may be months or years before this is incorporated.

In cut-based AOP, cuts are sometimes called "transparent subclasses" because they extend a class's functionality in a modular way. Cuts act as subclasses, but callers continue to instantiate the parent class rather than the subclass.

The Ruby Facets library (facets.rubyforge.org) includes a pure-Ruby cut-based AOP library (http://facets.rubyforge.org/api/more/classes/Cut.html). It has some limitations due to being written purely in Ruby, but the usage is fairly clean:

	class Person
	  def say_hi
	    puts "Hello!"
	  end
	end

	cut :Tracer < Person do
	  def say_hi
	    puts "Before method"
	    super 
	    puts "After method"
	  end
	end

	Person.new.say_hi
	# >> Before method
	# >> Hello!
	# >> After method

Here we see that the Tracer cut is a transparent subclass: when we create an instance of Person, it is affected by Tracer without having to know about Tracer. We can also change Person#say_hi without disrupting our cut.

For whatever reason, Ruby AOP techniques have not taken off. We will now introduce the standard way to deal with separation of concerns in Ruby.

Method chaining

The standard Ruby solution to this problem is method chaining: aliasing an existing method to a new name and overwriting its old definition with a new body. This new body usually calls the old method definition by referring to the aliased name (the equivalent of calling super in an inherited overridden method). The effect is that a feature can be patched around an existing method. Due to Ruby's open class nature, features can be added to almost any code from anywhere. Needless to say, this must be done wisely so as to retain clarity.

There is a standard Ruby idiom for chaining methods. Assume we have some library code that grabs a Person object from across the network:

	class Person
	  def refresh
	    # (get data from server)
	  end
	end

This operation takes quite a while, and we would like to time it and log the results. Leveraging Ruby's open classes, we can just open up the Person class again and monkeypatch the logging code into refresh:

	class Person
	  def refresh_with_timing
	    start_time = Time.now.to_f
	    retval = refresh_without_timing
	    end_time = Time.now.to_f
	    logger.info "Refresh: #{"%.3f" % (end_time-start_time)} s."
	    retval
	  end

	  alias_method :refresh_without_timing, :refresh
	  alias_method :refresh, :refresh_with_timing
	end

We can put this code in a separate file (perhaps alongside other timing code), and, as long as we require it after the original definition of refresh, the timing code will be properly added around the original method call. This aids in separation of concerns because we can separate code into different files based on its functional concern, not necessarily based on the area that it modifies.

The two alias_method calls patch around the original call to refresh, adding our timing code. The first call aliases the original method as refresh_without_timing (giving us a name by which to call the original method from refresh_with_timing); the second call points refresh at our new method.

This paradigm of using two alias_method calls to add a feature is common enough that it has a name in Rails: alias_method_chain. It takes two arguments: the name of the original method and the name of the feature.

Using alias_method_chain, we can now collapse the two alias_method calls into one simple line:

	alias_method_chain :refresh, :timing
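
To see the whole pattern run without Rails, here is a simplified stand-in for the helper. SimpleChain and simple_alias_method_chain are hypothetical names, the sketch ignores details such as method names ending in ? or !, and an in-memory log array replaces the logger.

```ruby
# Simplified, hypothetical stand-in for alias_method_chain.
module SimpleChain
  def simple_alias_method_chain(target, feature)
    # Keep the original reachable as target_without_feature...
    alias_method "#{target}_without_#{feature}", target
    # ...and make the public name point at target_with_feature.
    alias_method target, "#{target}_with_#{feature}"
  end
end

class Person
  extend SimpleChain
  attr_reader :log

  def initialize
    @log = []
  end

  def refresh
    :fresh_data # stands in for the slow network call
  end
end

# Reopen the class, as a separate timing file would.
class Person
  def refresh_with_timing
    start_time = Time.now
    retval = refresh_without_timing
    @log << format('Refresh: %.3f s.', Time.now - start_time)
    retval
  end

  simple_alias_method_chain :refresh, :timing
end

person = Person.new
person.refresh  # => :fresh_data
person.log.size # => 1
```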

Modulization

Monkeypatching affords us a lot of power, but it pollutes the namespace of the patched class. Things can often be made cleaner by modulizing the additions and inserting the module in the class's lookup chain. Tobias Lütke's Active Merchant Rails plugin uses this approach for the view helpers. First, a module is created with the helper method:

	module ActiveMerchant
	  module Billing
	    module Integrations
	      module ActionViewHelper
	        def payment_service_for(order, account, options = {}, &proc)
	          ...
	        end
	      end
	    end
	  end
	end

Then, in the plugin's init.rb script, the module is included in ActionView::Base:

	require 'active_merchant/billing/integrations/action_view_helper' 
	ActionView::Base.send(:include, 
	  ActiveMerchant::Billing::Integrations::ActionViewHelper)

It certainly would be simpler in code to directly open ActionView::Base and add the method, but this has the advantage of modularity. All Active Merchant code is contained within the ActiveMerchant module.

There is one caveat to this approach. Because any included modules are searched for methods after the class's own methods are searched, you cannot directly overwrite a class's methods by including a module:

	module M
	  def test_method
	    "Test from M"
	  end
	end

	class C
	  def test_method
	    "Test from C"
	  end
	end

	C.send(:include, M)
	C.new.test_method # => "Test from C"
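
The lookup order can be checked directly with Module#ancestors. In this small sketch (the super call is added only for illustration), the class comes before the included module, so the module's version is reachable from the class's method via super but never replaces it.

```ruby
module M
  def test_method
    "Test from M"
  end
end

class C
  include M

  def test_method
    "Test from C, then " + super
  end
end

C.ancestors.first(2) # => [C, M]
C.new.test_method    # => "Test from C, then Test from M"
```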

Instead, you should create a new name in the module and use alias_method_chain:

	module M
	  def test_method_with_module
	    "Test from M"
	  end
	end

	class C
	  def test_method
	    "Test from C"
	  end
	end

	# for a plugin, these two lines would go in init.rb
	C.send(:include, M)
	C.class_eval { alias_method_chain :test_method, :module }

	C.new.test_method # => "Test from M"

^[10] http://seaside.st/

^[11] http://wiki.rubygarden.org/Ruby/page/show/AspectOrientedRuby