Now that we've covered the fundamentals of Ruby, we can examine some of the common metaprogramming techniques that are used in Rails.
Although we write examples in Ruby, most of these techniques are applicable to any dynamic programming language. In fact, many of Ruby's metaprogramming idioms are shamelessly stolen from Lisp, Smalltalk, or Perl.
Often we want to create an interface whose methods vary depending on some piece of runtime data.
The most prominent example of this in Rails is ActiveRecord's attribute accessor methods. Method calls on an ActiveRecord object (like person.name) are translated at runtime to attribute accesses. At the class-method level, ActiveRecord offers extreme flexibility: Person.find_all_by_user_id_and_active(42, true) is translated into the appropriate SQL query, raising the standard NoMethodError exception should those attributes not exist.
The magic behind this is Ruby's method_missing method. When a nonexistent method is called on an object, Ruby first checks that object's class for a method_missing method before raising a NoMethodError. method_missing's first argument is the name of the method called; the remainder of the arguments correspond to the arguments passed to the method. Any block passed to the method is passed through to method_missing. So, a complete method signature is:
def method_missing(method_id, *args, &block)
  ...
end
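As a rough illustration of that signature in use, the sketch below intercepts finder-style calls on a small in-memory collection. It is not ActiveRecord's implementation: PersonStore, its nested Person struct, and the sample data are hypothetical names invented for this example.

```ruby
# Hypothetical in-memory stand-in for an ActiveRecord-style class.
class PersonStore
  Person = Struct.new(:name, :user_id, :active)

  def initialize(people)
    @people = people
  end

  # Intercepts calls such as find_all_by_user_id_and_active(42, true).
  def method_missing(method_id, *args, &block)
    match = method_id.to_s.match(/\Afind_all_by_(.+)\z/)
    return super unless match # not a finder: default NoMethodError

    attributes = match[1].split("_and_")
    unknown = attributes.reject { |a| Person.members.include?(a.to_sym) }
    return super unless unknown.empty? # unknown attribute: NoMethodError

    # Keep the people whose named attributes equal the given arguments.
    @people.select do |person|
      attributes.zip(args).all? { |attr, value| person[attr] == value }
    end
  end
end

store = PersonStore.new([
  PersonStore::Person.new("Brad", 42, true),
  PersonStore::Person.new("John", 42, false),
  PersonStore::Person.new("Anne", 7, true)
])

store.find_all_by_user_id_and_active(42, true).map { |p| p.name } # => ["Brad"]
```

Calling super for names the method does not recognize preserves the standard NoMethodError behavior for everything else.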
There are several drawbacks to using method_missing:
It is slower than conventional method lookup. Simple tests indicate that method dispatch with method_missing is at least two to three times as expensive in time as conventional dispatch.
Since the methods being called never actually exist—they are just intercepted at the last step of the method lookup process—they cannot be documented or introspected as conventional methods can.
Because all dynamic methods must go through the method_missing method, the body of that method can become quite large if there are many different aspects of the code that need to add methods dynamically.
Using method_missing restricts compatibility with future versions of an API. Once you rely on method_missing to do something interesting with undefined methods, introducing new methods in a future API version can break your users' expectations.
A good alternative is the approach taken by ActiveRecord's generate_read_methods feature. Rather than waiting for method_missing to intercept the calls, ActiveRecord generates an implementation for the attribute setter and reader methods so that they can be called via conventional method dispatch.
This is a powerful technique in general, and the dynamic nature of Ruby makes it possible to write methods that replace themselves with optimized versions when they are first called. Rails routing, which needs to be very fast, uses this approach; we will see it in action later in this chapter.
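A minimal sketch of a method that replaces itself on first call follows. It is not taken from Rails routing; the Report class and its computations counter are hypothetical and exist only to make the replacement observable.

```ruby
class Report
  attr_reader :computations

  def initialize
    @computations = 0
  end

  # First call: do the expensive work, then redefine #total on this
  # object so that later calls return the stored value directly.
  def total
    @computations += 1
    value = (1..1_000).reduce(:+) # stand-in for slow work
    define_singleton_method(:total) { value }
    value
  end
end

report = Report.new
report.total        # => 500500 (runs the slow path)
report.total        # => 500500 (runs the replacement)
report.computations # => 1
```

The replacement is defined per object here; a variant could redefine the method on the class itself when the optimized version is valid for every instance.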
One powerful technique that encompasses some of the others is generative programming—code that writes code.
This technique can manifest in the simplest ways, such as writing a shell script to automate some tedious part of programming. For example, you may want to populate your test fixtures with a sample project for each user:
brad_project:
  id: 1
  owner_id: 1
  billing_status_id: 12
john_project:
  id: 2
  owner_id: 2
  billing_status_id: 4
...
If this were a language without scriptable test fixtures, you might be writing these by hand. This gets messy when the data starts growing, and is next to impossible when the fixtures have strange dependencies on the source data. Naïve generative programming would have you writing a script to generate this fixture from the source. Although not ideal, this is a great improvement over writing the complete fixtures by hand. But this is a maintenance headache: you have to incorporate the script into your build process, and ensure that the fixture is regenerated when the source data changes.
This is rarely, if ever, needed in Ruby or Rails (thankfully). Almost every aspect of Rails application configuration is scriptable, due in large part to the use of internal domain-specific languages (DSLs). In an internal DSL, you have the full power of the Ruby language at your disposal, not just the particular interface the library author decided you should have.
Returning to the preceding example, ERb makes our job a lot easier. We can inject arbitrary Ruby code into the YAML file above using ERb's <% %> and <%= %> tags, including whatever logic we need:
<% User.find_all_by_active(true).each_with_index do |user, i| %>
<%= user.login %>_project:
  id: <%= i %>
  owner_id: <%= user.id %>
  billing_status_id: <%= user.billing_status.id %>
<% end %>
ActiveRecord's implementation of this handy trick couldn't be simpler:
yaml = YAML::load(erb_render(yaml_string))
using the helper method erb_render:
def erb_render(fixture_content)
  ERB.new(fixture_content).result
end
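The same render-then-parse pipeline can be tried outside Rails. The sketch below assumes nothing about ActiveRecord's internals: FixtureUser, FIXTURE_TEMPLATE, and load_fixture are hypothetical names, the users come from an in-memory array rather than a database, and the ids are numbered from 1 to match the sample fixture.

```ruby
require 'erb'
require 'yaml'

# Hypothetical stand-in for the User model; no database is involved.
FixtureUser = Struct.new(:id, :login)

FIXTURE_TEMPLATE = <<~YAML_ERB
  <% users.each_with_index do |user, i| %>
  <%= user.login %>_project:
    id: <%= i + 1 %>
    owner_id: <%= user.id %>
  <% end %>
YAML_ERB

# Renders the ERb tags first, then parses the resulting text as YAML.
# The template sees the local variable users through the method's binding.
def load_fixture(template, users)
  YAML.load(ERB.new(template).result(binding))
end

fixtures = load_fixture(FIXTURE_TEMPLATE,
                        [FixtureUser.new(1, "brad"), FixtureUser.new(2, "john")])
fixtures.keys                     # => ["brad_project", "john_project"]
fixtures["john_project"]["id"]    # => 2
```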
Generative programming often uses either Module#define_method or class_eval and def to create methods on-the-fly. ActiveRecord uses this technique for attribute accessors; the generate_read_methods feature defines the setter and reader methods as instance methods on the ActiveRecord class in order to reduce the number of times method_missing (a relatively expensive technique) is needed.
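To make the define_method approach concrete, here is a small sketch that generates reader and writer methods from a list of attribute names. It only illustrates the general idea and is not ActiveRecord's code; Record, Contact, and generate_accessors are hypothetical names.

```ruby
class Record
  # Hypothetical helper: defines a reader and a writer for each name,
  # so later calls use conventional dispatch instead of method_missing.
  def self.generate_accessors(*names)
    names.each do |name|
      define_method(name) { @attributes[name] }
      define_method("#{name}=") { |value| @attributes[name] = value }
    end
  end

  def initialize(attributes = {})
    @attributes = attributes
  end
end

class Contact < Record
  generate_accessors :name, :email
end

contact = Contact.new(:name => "Brad")
contact.name                           # => "Brad"
contact.email = "brad@example.com"
contact.email                          # => "brad@example.com"
Contact.instance_methods(false).sort   # => [:email, :email=, :name, :name=]
```

Because the generated methods really exist on the class, they show up in instance_methods and can be introspected like any hand-written method.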
Continuations are a very powerful control-flow mechanism. A continuation represents a particular state of the call stack and lexical variables. It is a snapshot of a point in time when evaluating Ruby code. Unfortunately, the Ruby 1.8 implementation of continuations is so slow as to be unusable for many applications. The upcoming Ruby 1.9 virtual machines may improve this situation, but you should not expect good performance from continuations under Ruby 1.8. However, they are useful constructs, and continuation-based web frameworks provide an interesting alternative to frameworks like Rails, so we will survey their use here.
Continuations are powerful for several reasons:
Continuations are often described as "structured GOTO." As such, they should be treated with the same caution as any kind of GOTO construct. Continuations have little or no place inside application code; they should usually be encapsulated within libraries. I don't say this because I think developers should be protected from themselves. Rather, continuations are general enough that it makes more sense to build abstractions around them than to use them directly. The idea is that a programmer should think "external iterator" or "coroutine" (both abstractions built on top of continuations) rather than "continuation" when building the application software.
Seaside [10] is a Smalltalk web application framework built on top of continuations. Continuations are used in Seaside to manage session state. Each user session corresponds to a server-side continuation. When a request comes in, the continuation is invoked and more code is run. The upshot is that entire transactions can be written as a single stream of code, even if they span multiple HTTP requests. This power comes from the fact that Smalltalk's continuations are serializable; they can be written out to a database or to the filesystem, then thawed and reinvoked upon a request. Ruby's continuations are nonserializable. In Ruby, continuations are in-memory only and cannot be transformed into a byte stream.
Borges (http://borges.rubyforge.org/) is a straightforward port of Seaside 2 to Ruby. The major difference between Seaside and Borges is that Borges must store all current continuations in memory, as they are not serializable. This is a huge limitation that unfortunately prevents Borges from being successful for web applications with any kind of volume. If serializable continuations are implemented in one of the Ruby implementations, this limitation can be removed.
The power of continuations is evident in the following Borges sample code, which renders a list of items from an online store:
class SushiNet::StoreItemList < Borges::Component
  def choose(item)
    call SushiNet::StoreItemView.new(item)
  end
  def initialize(items)
    @batcher = Borges::BatchedList.new items, 8
  end
  def render_content_on(r)
    r.list_do @batcher.batch do |item|
      r.anchor item.title do choose item end
    end
    r.render @batcher
  end
end # class SushiNet::StoreItemList
The bulk of the action happens in the render_content_on method, which uses a BatchedList (a paginator) to render a paginated list of links to products. But the fun happens in the call to anchor, which stores away the call to choose, to be executed when the corresponding link is clicked.
However, there is still vast disagreement on how useful continuations are for web programming. HTTP was designed as a stateless protocol, and continuations for web transactions are the polar opposite of statelessness. All of the continuations must be stored on the server, which takes additional memory and disk space. Sticky sessions are required, to direct a user's traffic to the same server. As a result, if one server goes down, all of its sessions are lost. The most popular Seaside application, DabbleDB (http://dabbledb.com/), actually uses continuations very little.
Bindings provide context for evaluation of Ruby code. A binding is the set of variables and methods that are available at a particular (lexical) point in the code. Any place in Ruby code where statements may be evaluated has a binding, and that binding can be obtained with Kernel#binding. Bindings are just objects of class Binding, and they can be passed around as any objects can:
class C
  binding # => #<Binding:0x2533c>
  def a_method
    binding
  end
end
binding # => #<Binding:0x252b0>
C.new.a_method # => #<Binding:0x25238>
The Rails scaffold generator provides a good example of the use of bindings:
class ScaffoldingSandbox
  include ActionView::Helpers::ActiveRecordHelper
  attr_accessor :form_action, :singular_name, :suffix, :model_instance
  def sandbox_binding
    binding
  end
  # ...
end
ScaffoldingSandbox is a class that provides a clean environment from which to render a template. ERb can render templates within the context of a binding, so that an API is available from within the ERb templates.
part_binding = template_options[:sandbox].call.sandbox_binding
# ...
ERB.new(File.readlines(part_path).join,nil,'-').result(part_binding)
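The sandbox idea can be tried in isolation. The following sketch is not the scaffold generator's code; TemplateSandbox, its heading helper, and the template string are hypothetical. It shows that a template rendered against an object's binding can call that object's methods directly.

```ruby
require 'erb'

# Hypothetical sandbox: the template can see only what this object exposes.
class TemplateSandbox
  attr_accessor :singular_name, :form_action

  def initialize(singular_name, form_action)
    @singular_name = singular_name
    @form_action = form_action
  end

  # A helper method made available to templates.
  def heading
    "Editing #{singular_name}"
  end

  # Returns a binding whose self is this sandbox instance.
  def sandbox_binding
    binding
  end
end

template = "<h1><%= heading %></h1> <form action=\"<%= form_action %>\"></form>"
sandbox = TemplateSandbox.new("person", "/people/update")
html = ERB.new(template).result(sandbox.sandbox_binding)
# html => "<h1>Editing person</h1> <form action=\"/people/update\"></form>"
```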
Earlier I mentioned that blocks are closures. A closure's binding represents its state: the set of variables and methods it has access to. We can get at a closure's binding with the Proc#binding method:
def var_from_binding(&b)
  eval("var", b.binding)
end
var = 123
var_from_binding {} # => 123
var = 456
var_from_binding {} # => 456
Here we are only using the Proc as a method by which to get the binding. By accessing the binding (context) of those blocks, we can access the local variable var with a simple eval against the binding.
Ruby provides many methods for looking into objects at runtime. There are object methods to access instance variables. These methods break encapsulation, so use them with care.
class C
  def initialize
    @ivar = 1
  end
end
c = C.new
c.instance_variables # => ["@ivar"]
c.instance_variable_get(:@ivar) # => 1
c.instance_variable_set(:@ivar, 3) # => 3
c.instance_variable_get(:@ivar) # => 3
The Object#methods method returns an array of instance methods, including singleton methods, defined on the receiver. If the first parameter to methods is false, only the object's singleton methods are returned.
class C
  def inst_method
  end
  def self.cls_method
  end
end
c = C.new
class << c
  def singleton_method
  end
end
c.methods - Object.methods # => ["inst_method", "singleton_method"]
c.methods(false) # => ["singleton_method"]
Module#instance_methods returns an array of the class or module's instance methods. Note that instance_methods is called on the class, while methods is called on an instance. Passing false to instance_methods skips the superclasses' methods:
C.instance_methods(false) # => ["inst_method"]
We can also use Metaid's metaclass method to examine C's class methods:
C.metaclass.instance_methods(false) # => ["new", "allocate", "cls_method", "superclass"]
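On Ruby 1.9.2 and later, the built-in Object#singleton_class gives the same view without Metaid. The sketch below assumes such a Ruby; Widget is a hypothetical class, and on these versions the reflection methods return symbols rather than strings.

```ruby
class Widget
  def inst_method; end
  def self.cls_method; end
end

# The singleton class of Widget holds Widget's class methods.
class_methods = Widget.singleton_class.instance_methods(false)
class_methods.include?(:cls_method)              # => true

# Object#singleton_methods is a shortcut for the same question.
Widget.singleton_methods(false).include?(:cls_method) # => true

# Instance methods stay on the class itself.
Widget.instance_methods(false)                   # => [:inst_method]
```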
In my experience, most of the value from these methods is in satisfying curiosity. With the exception of a few well-established idioms, there is rarely a need in production code to reflect on an object's methods. Far more often, these techniques can be used at a console prompt to find methods available on an object—it's usually quicker than reaching for a reference book:
Array.instance_methods.grep /sort/ # => ["sort!", "sort", "sort_by"]
ObjectSpace is a module used to interact with Ruby's object system. It has a few useful module methods that can make low-level hacking easier:
Garbage-collection methods: define_finalizer (sets up a callback to be called just before an object is destroyed), undefine_finalizer (removes those callbacks), and garbage_collect (starts garbage collection).
_id2ref converts an object's ID to a reference to that Ruby object.
each_object iterates through all objects (or all objects of a certain class) and yields them to a block.
As always, with great power comes great responsibility. Although these methods can be useful, they can also be dangerous. Use them judiciously.
An example of the proper use of ObjectSpace is found in Ruby's Test::Unit framework. This code uses ObjectSpace.each_object to enumerate all classes in existence that inherit from Test::Unit::TestCase:
test_classes = []
ObjectSpace.each_object(Class) { |klass| test_classes << klass if (Test::Unit::TestCase > klass) }
ObjectSpace, unfortunately, greatly complicates some Ruby virtual machines. In particular, JRuby performance suffers tremendously when ObjectSpace is enabled, because the Ruby interpreter cannot directly examine the JVM's heap for extant objects. Instead, JRuby must keep track of objects manually, which adds a great amount of overhead. As the same tricks can be achieved with methods like Module.extended and Class.inherited, there are not many cases where ObjectSpace is genuinely necessary.
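As a sketch of the Class.inherited alternative, the base class below records each descendant as it is defined, so no heap scan is needed to enumerate them. This is not how Test::Unit is implemented; TestCaseBase, subclasses_seen, and the test class names are hypothetical.

```ruby
class TestCaseBase
  # Registry of every class that has inherited from TestCaseBase.
  def self.subclasses_seen
    @subclasses_seen ||= []
  end

  # Ruby calls this hook each time a subclass is created, including
  # subclasses of subclasses (which inherit the hook).
  def self.inherited(subclass)
    super
    TestCaseBase.subclasses_seen << subclass
  end
end

class UserTest < TestCaseBase; end
class OrderTest < TestCaseBase; end
class AdminUserTest < UserTest; end

TestCaseBase.subclasses_seen # => [UserTest, OrderTest, AdminUserTest]
```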
Delegation is a form of composition. It is similar to inheritance, except with more conceptual "space" between the objects being composed. Delegation implies a "has-a" rather than an "is-a" relationship. When one object delegates to another, there are two objects in existence, rather than the one object that would result from an inheritance hierarchy.
Delegation is used in ActiveRecord's associations. The AssociationProxy class delegates most methods (including class) to its target. In this way, associations can be lazily loaded (not loaded until their data is needed) with a completely transparent interface.
Ruby's standard library includes facilities for delegation. The simplest is DelegateClass. By inheriting from DelegateClass(klass) and calling super(instance) in the constructor, a class delegates any unknown method calls to the provided instance of the class klass. As an example, consider a Settings class that delegates to a hash:
require 'delegate'
class Settings < DelegateClass(Hash)
  def initialize(options = {})
    super({:initialized_at => Time.now - 5}.merge(options))
  end
  def age
    Time.now - self[:initialized_at]
  end
end
settings = Settings.new :use_foo_bar => true
# Method calls are delegated to the object
settings[:use_foo_bar] # => true
settings.age # => 5.000301
The Settings constructor calls super to set the delegated object to a new hash. Note the difference between composition and inheritance: if we had inherited from Hash, then Settings would be a hash; in this case, Settings has a hash and delegates to it. This composition relationship offers increased flexibility, especially when the object to be delegated to may change (a function provided by SimpleDelegator).
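A short sketch of swapping the delegated object with SimpleDelegator follows. CurrentSettings and switch_to are hypothetical names; __setobj__ and __getobj__ are the standard library's methods for replacing and reading the target.

```ruby
require 'delegate'

class CurrentSettings < SimpleDelegator
  # Points the delegator at a different underlying object.
  def switch_to(new_settings)
    __setobj__(new_settings)
    self
  end
end

defaults  = { :use_foo_bar => false }
overrides = { :use_foo_bar => true }

settings = CurrentSettings.new(defaults)
settings[:use_foo_bar]        # => false (delegated to defaults)
settings.switch_to(overrides)
settings[:use_foo_bar]        # => true (delegated to overrides)
```

Callers keep holding the same settings object throughout; only the object behind it changes.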
The Ruby standard library also includes Forwardable, which provides a simple interface by which individual methods, rather than all undefined methods, can be delegated to another object. ActiveSupport in Rails provides similar functionality with a cleaner API through Module#delegate:
class User < ActiveRecord::Base
  belongs_to :person
  delegate :first_name, :last_name, :phone, :to => :person
end
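For comparison, here is roughly the same delegation written with the standard library's Forwardable, outside Rails. PersonRecord and UserAccount are hypothetical plain-Ruby stand-ins for the models above.

```ruby
require 'forwardable'

# Hypothetical stand-in for the associated person record.
PersonRecord = Struct.new(:first_name, :last_name, :phone)

class UserAccount
  extend Forwardable

  # Forward only these three methods to the @person instance variable.
  def_delegators :@person, :first_name, :last_name, :phone

  def initialize(person)
    @person = person
  end
end

user = UserAccount.new(PersonRecord.new("Ada", "Lovelace", "555-0100"))
user.first_name # => "Ada"
user.phone      # => "555-0100"
```

Methods that are not listed, such as PersonRecord#to_a, are not forwarded.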
In Ruby, all classes are open. Any object or class is fair game to be modified at any time. This gives many opportunities for extending or overriding existing functionality. This extension can be done very cleanly, without modifying the original definitions.
Rails takes advantage of Ruby's open class system extensively. Opening classes and adding code is referred to as monkeypatching (a term from the Python community). Though it sounds derogatory, this term is used in a decidedly positive light; monkeypatching is, on the whole, seen as an incredibly useful technique. Almost all Rails plugins monkeypatch the Rails core in some way or another.
There are two primary disadvantages to monkeypatching. First, the code for one method call may be spread over several files. The foremost example of this is in ActionController's process method. This method is intercepted by methods in up to five different files during the course of a request. Each of these methods adds another feature: filters, exception rescue, components, and session management. The end result is a net gain: the benefit gained by separating each functional component into a separate file outweighs the inflated call stack.
Another consequence of the functionality being spread around is that it can be difficult to properly document a method. Because the function of the process method can change depending on which code has been loaded, there is no good place to document what each of the methods is adding. This problem exists because the actual identity of the process method changes as the methods are chained together.
Because Rails encourages the philosophy of separation of concerns, you often will have the need to extend the functionality of existing code. Many times you will want to "patch" a feature onto an existing function without disturbing that function's code. Your addition may not be directly related to the function's original purpose: it may add authentication, logging, or some other important cross-cutting concern.
We will examine several approaches to the problem of cross-cutting concerns, and explain the one (method chaining) that has acquired the most momentum in the Ruby and Rails communities.
In traditional object-oriented programming, a class can be extended by inheriting from it and changing its data or behavior. This paradigm works for many purposes, but it has drawbacks:
The changes you want to make may be small, in which case setting up a new class may be overly complex. Each new class in an inheritance hierarchy adds to the mental overhead required to understand the code.
You may need to make a series of related changes to several otherwise-unrelated classes. Subclassing each one individually would be overkill and would separate functionality that should be kept together.
The class may already be in use throughout an application, and you want to change its behavior globally.
You may want to add or remove a feature at runtime, and have it take effect globally. (We will explore this technique with a full example later in the chapter.)
In more traditional object-oriented languages, these features would require complex code. Not only would the code be complex, it would be tightly coupled to either the existing code or the code that calls it.
Aspect-oriented programming (AOP) is one technique that attempts to solve the issues of cross-cutting concerns. There has been much talk about the applicability of AOP to Ruby, since many of the advantages that AOP provides can already be obtained through metaprogramming. There is a Ruby proposal for cut-based AOP, [11] but it may be months or years before this is incorporated.
In cut-based AOP, cuts are sometimes called "transparent subclasses" because they extend a class's functionality in a modular way. Cuts act as subclasses, but you do not need to instantiate the cut in place of the parent class; instances of the parent class pick up the cut's behavior.
The Ruby Facets library (facets.rubyforge.org) includes a pure-Ruby cut-based AOP library (http://facets.rubyforge.org/api/more/classes/Cut.html). It has some limitations due to being written purely in Ruby, but the usage is fairly clean:
class Person
  def say_hi
    puts "Hello!"
  end
end
cut :Tracer < Person do
  def say_hi
    puts "Before method"
    super
    puts "After method"
  end
end
Person.new.say_hi
# >> Before method
# >> Hello!
# >> After method
Here we see that the Tracer cut is a transparent subclass: when we create an instance of Person, it is affected by Tracer without having to know about Tracer. We can also change Person#say_hi without disrupting our cut.
For whatever reason, Ruby AOP techniques have not taken off. We will now introduce the standard way to deal with separation of concerns in Ruby.
The standard Ruby solution to this problem is method chaining: aliasing an existing method to a new name and overwriting its old definition with a new body. This new body usually calls the old method definition by referring to the aliased name (the equivalent of calling super in an inherited overridden method). The effect is that a feature can be patched around an existing method. Due to Ruby's open class nature, features can be added to almost any code from anywhere. Needless to say, this must be done wisely so as to retain clarity.
There is a standard Ruby idiom for chaining methods. Assume we have some library code that grabs a Person object from across the network:
class Person
  def refresh
    # (get data from server)
  end
end
This operation takes quite a while, and we would like to time it and log the results. Leveraging Ruby's open classes, we can just open up the Person class again and monkeypatch the logging code into refresh:
class Person
  def refresh_with_timing
    start_time = Time.now.to_f
    retval = refresh_without_timing
    end_time = Time.now.to_f
    logger.info "Refresh: #{"%.3f" % (end_time-start_time)} s."
    retval
  end
  alias_method :refresh_without_timing, :refresh
  alias_method :refresh, :refresh_with_timing
end
We can put this code in a separate file (perhaps alongside other timing code), and, as long as we require it after the original definition of refresh, the timing code will be properly added around the original method call. This aids in separation of concerns because we can separate code into different files based on its functional concern, not necessarily based on the area that it modifies.
The two alias_method calls patch around the original call to refresh, adding our timing code. The first call aliases the original method as refresh_without_timing (giving us a name by which to call the original method from refresh_with_timing); the second call points refresh at our new method.
This paradigm of using two alias_method calls to add a feature is common enough that it has a name in Rails: alias_method_chain. It takes two arguments: the name of the original method and the name of the feature.
Using alias_method_chain, we can now collapse the two alias_method calls into one simple line:
alias_method_chain :refresh, :timing
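To see what that one line expands to, here is a deliberately simplified stand-in written in plain Ruby. It is not the ActiveSupport implementation (it ignores method names ending in ?, !, or =), and simple_alias_method_chain, RemotePerson, and the timings array are hypothetical names used only for illustration.

```ruby
class Module
  # Simplified sketch: performs the same two alias_method calls
  # shown in the text, building the names from target and feature.
  def simple_alias_method_chain(target, feature)
    alias_method "#{target}_without_#{feature}", target
    alias_method target, "#{target}_with_#{feature}"
  end
end

class RemotePerson
  def refresh
    :fresh_data # stand-in for fetching data from a server
  end
end

# Reopen the class and wrap refresh with timing.
class RemotePerson
  def timings
    @timings ||= []
  end

  def refresh_with_timing
    start_time = Time.now.to_f
    retval = refresh_without_timing
    timings << (Time.now.to_f - start_time)
    retval
  end

  simple_alias_method_chain :refresh, :timing
end

person = RemotePerson.new
person.refresh         # => :fresh_data
person.timings.length  # => 1
```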
Monkeypatching affords us a lot of power, but it pollutes the namespace of the patched class. Things can often be made cleaner by modularizing the additions and inserting the module in the class's lookup chain. Tobias Lütke's Active Merchant Rails plugin uses this approach for the view helpers. First, a module is created with the helper method:
module ActiveMerchant
  module Billing
    module Integrations
      module ActionViewHelper
        def payment_service_for(order, account, options = {}, &proc)
          ...
        end
      end
    end
  end
end
Then, in the plugin's init.rb script, the module is included in ActionView::Base:
require 'active_merchant/billing/integrations/action_view_helper'
ActionView::Base.send(:include, ActiveMerchant::Billing::Integrations::ActionViewHelper)
It certainly would be simpler in code to directly open ActionView::Base and add the method, but this has the advantage of modularity. All Active Merchant code is contained within the ActiveMerchant module.
There is one caveat to this approach. Because any included modules are searched for methods after the class's own methods are searched, you cannot directly overwrite a class's methods by including a module:
module M
  def test_method
    "Test from M"
  end
end
class C
  def test_method
    "Test from C"
  end
end
C.send(:include, M)
C.new.test_method # => "Test from C"
Instead, you should create a new name in the module and use alias_method_chain:
module M
  def test_method_with_module
    "Test from M"
  end
end
class C
  def test_method
    "Test from C"
  end
end
# for a plugin, these two lines would go in init.rb
C.send(:include, M)
C.class_eval { alias_method_chain :test_method, :module }
C.new.test_method # => "Test from M"
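On Ruby 2.0 and later there is another option that the text above predates: Module#prepend inserts a module ahead of the class in the lookup chain, so the module's method runs first and reaches the class's definition through super. The sketch below assumes such a Ruby; LoudModule and Greeter are hypothetical names.

```ruby
module LoudModule
  # Found before Greeter#test_method because the module is prepended.
  def test_method
    "Test from LoudModule, wrapping: #{super}"
  end
end

class Greeter
  def test_method
    "Test from Greeter"
  end
end

Greeter.send(:prepend, LoudModule)

Greeter.new.test_method     # => "Test from LoudModule, wrapping: Test from Greeter"
Greeter.ancestors.first(2)  # => [LoudModule, Greeter]
```

Unlike the aliasing approach, this adds no extra method names to the patched class.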