Chapter 11

Scripting

Up until this chapter, I have focused on general aspects of API design that could be applicable to any C++ project. Having covered the standard API design pipeline, the remaining chapters in this book deal with the more specialized topics of scripting and extensibility. While not all APIs need to be concerned with these topics, they are becoming more popular subjects in modern application development. I therefore felt that a comprehensive book on C++ API design should include coverage of these advanced topics.

Accordingly, this chapter deals with the topic of scripting, that is, allowing your C++ API to be accessed from a scripting language, such as Python, Ruby, Lua, Tcl, or Perl. I will explain why you might want to do this and what some of the issues are that you need to be aware of, and then review some of the main technologies that let you create bindings for these languages.

To make this chapter more practical and instructive, I will take a detailed look at two different script binding technologies and show how these can be used to create bindings for two different scripting languages. Specifically, I will provide an in-depth treatment of how to create Python bindings for your C++ API using Boost Python, followed by a thorough analysis of how to create Ruby bindings using the Simplified Wrapper and Interface Generator (SWIG). I have chosen to focus on Python and Ruby because these are two of the most popular scripting languages in use today; in terms of binding technologies, Boost Python and SWIG are both freely available open source solutions that provide extensive control over the resulting bindings.

11.1 Adding Script Bindings

11.1.1 Extending versus Embedding

A script binding provides a way to access a C++ API from a scripting language. This normally involves creating wrapper code for the C++ classes and functions that allow them to be imported into the scripting language using its native module loading features, for example, the import keyword in Python, require in Ruby, or use in Perl.

There are two main strategies for integrating C++ with a scripting language:

1. Extending the language. In this model, a script binding is provided as a module that supplements the functionality of the scripting language. That is, users who write code with the scripting language can use your module in their own scripts. Your module will look just like any other module for that language. For example, the expat and md5 modules in the Python standard library are implemented in C, not Python.

2. Embedding within an application. In this case, an end-user C++ application embeds a scripting language inside of it. Script bindings are then used to let end users write scripts for that specific application that call down into the core functionality of the program. Examples of this include the Autodesk Maya 3D modeling system, which offers Python and Maya Embedded Language (MEL) scripting, and the Adobe Director multimedia authoring platform, which embeds the Lingo scripting language.

Whichever strategy applies to your situation, the procedure to define and build script bindings is the same in each case. The only thing that really changes is who owns the C++ main() function.

11.1.2 Advantages of Scripting

The decision to provide access to native code APIs from within a scripting language offers many advantages. These advantages can either apply directly to you, if you provide a supported script binding for your C++ API, or to your clients, who may create their own script bindings on top of your C++-only API. I enumerate a few of these benefits here.

• Cross-platform. Scripting languages are interpreted, meaning that they execute plain ASCII source code or platform-independent byte code. They will also normally provide their own modules to interface with platform-specific features, such as the file system. Writing code for a scripting language should therefore work on multiple platforms without modification. This can also be considered a disadvantage for proprietary projects because scripting code will normally have to be distributed in source form.

• Faster development. If you make a change to a C++ program, you have to compile and link your code again. For large systems, this can be a time-consuming operation and can fracture engineer productivity as they wait to be able to test their change. In a scripting language, you simply edit the source code and then run it: there is no compile and link stage. This allows you to prototype and test new changes quickly, often resulting in greater engineer efficiency and project velocity.

• Write less code. A given problem can normally be solved with less code in a higher-level scripting language versus C++. Scripting languages don’t require explicit memory management, they tend to have a much larger standard library available than C++’s STL, and they often take care of complex concepts such as reference counting behind the scenes. For example, the following single line of Ruby code will take a string and return an alphabetized list of all the unique letters in that string. This would take a lot more code to implement in C++.

 "Hello World".downcase.split("").uniq.sort.join
 => " dehlorw"

• Script-based applications. The traditional view of scripting languages is that you use them for small command-line tasks, but you must write an end-user application in a language such as C++ for maximum efficiency. However, an alternative view is that you can write the core performance-critical routines in C++, create script bindings for them, and then write your application in a scripting language. In Model–View–Controller parlance, the Model and potentially also the View are written in C++, whereas the Controller is implemented using a scripting language. The key insight is that you don’t need a super fast compiled language to manage user input that happens at a low frequency.
At Pixar, we actually rewrote our in-house animation toolset this way: as a Python-based main application that calls down into an extensive set of very efficient Model and View C++ APIs. This gave us all the advantages listed here, such as removing the compile-link phase for many application logic changes while still delivering an interactive 3D animation system for our artists.

• Support for expert users. Adding a scripting language to an end-user application can allow advanced users to customize the functionality of the application by writing macros to perform repetitive tasks or tasks that are not exposed through the GUI. This can be done without sacrificing the usability of the software for novice users, who will interface with the application solely through the GUI.

• Extensibility. In addition to giving expert users access to under-the-covers functionality, a scripting interface can be used to let them add entirely new functionality to the application through plugin interfaces. This means that the developer of the application is no longer responsible for solving every user’s problem. Instead, users have the power to solve their own problems. For example, the Firefox Web browser allows new extensions to be created using JavaScript as its embedded scripting language.

• Scripting for testability. One extremely valuable side effect of being able to write code in a scripting language is that you can write automated tests using that language. This is an advantage because you can enable your QA team to write automated tests too rather than rely solely on black-box testing. Often (although not exclusively), QA engineers will not write C++ code. However, there are many skilled white-box QA engineers who can write scripting language code. Involving your QA team in writing automated tests can let them contribute at a lower level and provide greater testing coverage.

• Expressiveness. The field of linguistics defines the principle of linguistic relativity (also known as the Sapir–Whorf hypothesis) as the idea that people’s thoughts and behavior are influenced by their language. When applied to the field of computer science, this can mean that the expressiveness, flexibility, and ease of use of a programming language could impact the kinds of solutions that you can envision. That’s because you don’t have to be distracted by low-level issues such as memory management or statically typed data representations. This is obviously a more qualitative and subjective point than the previous technical arguments, but it is no less valid or significant.

11.1.3 Language Compatibility Issues

One important issue to be aware of when exposing a C++ API in a scripting language is that the patterns and idioms of C++ will not map directly to those of the scripting language. As such, a direct translation of the C++ API into the scripting language may produce a script module that doesn’t feel natural or native in that language. For example,

• Naming conventions. C++ functions are often written using either upper or lower camel case, that is, GetName() or getName(). However, the Python convention (defined in PEP 8) is to use snake case for method names, for example, get_name(). Similarly, Ruby specifies that method names should use snake case also.

• Getters/setters. In this book, I have advocated that you should never directly expose data members in your classes. Instead, you should always provide getter/setter methods to access those members. However, many scripting languages let you use member variable syntax while still routing every access through getter/setter methods. In fact, in Ruby, this is the only way that you can access member variables from outside of a class. The result is that instead of C++ style code such as

 object.SetName("Hello");
 std::string name = object.GetName();

you can simply write the following, which still involves the use of underlying getter/setter methods:

 object.name = "Hello"
 name = object.name

• Iterators. Most scripting languages support the general concept of iterators to navigate through the elements in a sequence. However, the implementation of this concept will not normally harmonize with the STL implementation. For example, C++ has five categories of iterators (forward, bidirectional, random access, input, and output), whereas Python has a single iterator category (forward). Making a C++ object iterable in a scripting language therefore requires specific attention to adapt it to the semantics of that language, such as adding an __iter__() method in the case of Python.

• Operators. You already know that C++ supports several operators, such as operator+, operator+=, and operator[]. Often these can be translated directly into the equivalent syntax of the scripting language, such as exposing C++’s stream operator<< as the to_s() method in Ruby (which returns a string representation of the object). However, the target language may support additional operators that are not supported by C++, such as Ruby’s power operator (**) and its divmod method, which returns the quotient and modulus of a division.

• Containers. STL provides container classes such as std::vector, std::set, and std::map. These are statically typed class templates that can only contain objects of the same type. By comparison, many scripting languages are dynamically typed and support containers with elements of different types. It’s much more common to use these flexible types to pass data around in scripting languages. For instance, a C++ method that returns several values through non-const pointer or reference arguments might be better represented in a scripting language by a method that returns a tuple:

 float width, height;
 GetDimensions(&width, &height);    // C++

 width, height = get_dimensions()   # Python

All of this means that creating a good script binding is often a process that requires a degree of manual tuning. Technologies that attempt to create bindings fully automatically will normally produce APIs that don’t feel natural in the scripting language. For example, the PyObjC utility provides a bridge for Objective-C objects in Python, but can result in cumbersome constructs in Python, such as methods called setValue_(). In contrast, a technology that lets you manually craft the way that functions are exposed in script will let you produce a higher quality result.
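To make this kind of manual tuning concrete, here is a minimal sketch of a hand-written adapter. It is pure Python so that it runs on its own: RawShape is a hypothetical stand-in for a class produced by a direct, untuned translation of a C++ API, and Shape, its method names, and the vertex data are likewise invented for illustration rather than taken from any real binding.

```python
class RawShape(object):
    """Hypothetical stand-in for a directly translated C++ binding."""

    def __init__(self, width, height):
        self._width = width
        self._height = height
        self._vertices = [(0.0, 0.0), (width, 0.0),
                          (width, height), (0.0, height)]

    # C++-style names and index-based access, as a naive binding might expose
    def GetWidth(self):
        return self._width

    def GetHeight(self):
        return self._height

    def GetNumVertices(self):
        return len(self._vertices)

    def GetVertex(self, index):
        return self._vertices[index]


class Shape(object):
    """Thin adapter that presents RawShape using Python idioms."""

    def __init__(self, width, height):
        self._raw = RawShape(width, height)

    def get_dimensions(self):
        # return a tuple instead of filling in output arguments
        return self._raw.GetWidth(), self._raw.GetHeight()

    def __len__(self):
        return self._raw.GetNumVertices()

    def __iter__(self):
        # a generator satisfies Python's forward-only iterator protocol
        for i in range(self._raw.GetNumVertices()):
            yield self._raw.GetVertex(i)


shape = Shape(4.0, 3.0)
width, height = shape.get_dimensions()
corners = list(shape)
```

The adapter changes no behavior. It only renames methods, returns a tuple, and supplies __len__() and __iter__() so that the object works with len() and for loops.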

11.1.4 Crossing the Language Barrier

The language barrier refers to the boundary where C++ meets the scripting language. Script bindings for an object will take care of forwarding method calls in the scripting language down into the relevant C++ code. However, having C++ code call up into the scripting language will not normally happen by default. This is because a C++ API that has not been specifically designed to interoperate with a scripting language will not know that it’s running in a script environment.

For example, consider a C++ class with a virtual method that gets overridden in Python. The C++ code has no idea that Python has overridden one of its virtual methods. This makes sense because the C++ vtable is created statically at compile time and cannot adapt to Python’s dynamic ability to add methods at run time. Some binding technologies provide extra functionality to make this cross-language polymorphism work. I will discuss how this is done for Boost Python and SWIG later in the chapter.

Another issue to be aware of is whether the C++ code uses an internal event or notification system. If this is the case, some extra mechanism will need to be put in place to forward any C++-triggered events across the language boundary into script code. For example, Qt and Boost offer a signal/slot system where C++ code can register to receive notifications when another C++ object changes state. However, allowing scripts to receive these events will require you to write explicit code that can intercept the C++ events and send them over the boundary to the script object.

Finally, exceptions are another case where C++ code may need to communicate with script code. For example, uncaught C++ exceptions must be caught at the language barrier and then be translated into the native exception type of the script language.
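The translation step can be pictured with a small sketch. It is written entirely in Python so that it is self-contained: NativeError, raw_find_person(), and the numeric error codes are hypothetical stand-ins for whatever an untranslated failure from the C++ layer looks like. A real binding would normally perform the equivalent mapping inside its C++ wrapper code rather than in script.

```python
import functools


class NativeError(Exception):
    """Hypothetical stand-in for an untranslated failure from the C++ side."""

    def __init__(self, code, message):
        super(NativeError, self).__init__(message)
        self.code = code


# assumed mapping from native error codes to built-in Python exception types
ERROR_MAP = {1: KeyError, 2: ValueError}


def translate_errors(func):
    """Convert NativeError into the closest native Python exception."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except NativeError as err:
            # unknown codes fall back to a generic RuntimeError
            exc_type = ERROR_MAP.get(err.code, RuntimeError)
            raise exc_type(str(err))
    return wrapper


def raw_find_person(name):
    # pretend this call crossed the language barrier and failed
    raise NativeError(1, "no such person: " + name)


@translate_errors
def find_person(name):
    return raw_find_person(name)


try:
    find_person("Martin")
except KeyError as err:
    caught = err
```

Script authors can then write an ordinary except KeyError clause without knowing anything about the error reporting conventions of the underlying C++ code.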

11.2 Script-binding Technologies

Various technologies can be used to generate the wrappers that allow a scripting language to call down into your C++ code. Each offers its own specific advantages and disadvantages. Some are language-neutral technologies that support many scripting languages (such as COM or CORBA), some are specific to C/C++ but provide support for creating bindings to many languages (such as SWIG), some provide C++ bindings for a single language (such as Boost Python), whereas others focus on C++ bindings for a specific API (such as the Pivy Python bindings for the Open Inventor C++ toolkit).

I will list several of these technologies here, and then in the remainder of the chapter I will focus on two in more detail. I have chosen to focus on portable yet C++-specific solutions rather than considering the more general and heavyweight interprocess communication models such as COM or CORBA. To provide greater utility, I will look at one binding technology that lets you define the script binding programmatically (Boost Python) and another that uses an interface definition file to generate code for the binding (SWIG).

Any script binding technology is essentially founded upon the Adapter design pattern, that is, it provides a one-to-one mapping of one API to another API while also translating data types into their most appropriate native form and perhaps using more idiomatic naming conventions. Recognizing this fact means that you should also be aware of the standard issues that face API wrapping design patterns such as Proxy and Adapter. Of principal concern is the need to keep the two APIs synchronized over time. As you will see, both Boost Python and SWIG require you to keep redundant files in sync as you evolve the C++ API, such as extra C++ files in the case of Boost and separate interface files in the case of SWIG. This often turns out to be the largest maintenance cost when supporting a scripting API.

11.2.3 Python-SIP

SIP is a tool that lets you create C and C++ bindings for Python. It was originally created for the PyQt package, which provides Python bindings for Nokia’s Qt toolkit. As such, Python-SIP has specific support for the Qt signal/slot mechanism. However, the tool can also be used to create bindings for any C++ API.

SIP works in a very similar fashion to SWIG, although it does not support the range of languages that SWIG does. SIP supports much of the C/C++ syntax for its interface specification files and uses a similar syntax for its commands as SWIG (i.e., tokens that start with a % symbol), although it supports a different set and style of commands to customize the binding. Here is an example of a simple Python-SIP interface specification file.

 // Define the SIP wrapper for an example library.

 // define the Python module name and generation number
 %Module example 0

 class Example {
 // include example.h in the wrapper that SIP generates
 %TypeHeaderCode
 #include <example.h>
 %End

 public:
     Example(const char *name);
     char *GetName() const;
 };

11.2.4 COM Automation

Component Object Model (COM) is a binary interface standard that allows objects to interact with each other via interprocess communication. COM objects specify well-defined interfaces that allow software components to be reused and linked together to build end-user applications. The technology was developed by Microsoft in 1993 and is still used today, predominantly on the Windows platform, although Microsoft now encourages the use of .NET and SOAP.

COM encompasses a large suite of technologies, but the part I will focus on here is COM Automation, also known as OLE Automation or simply Automation. This involves Automation objects (also known as ActiveX objects) being accessed from scripting languages to perform repetitive tasks or to control an application from script. A large number of target languages are supported, such as Visual Basic, JScript, Perl, Python, Ruby, and the range of Microsoft .NET languages.

A COM object is identified by a Universally Unique ID (UUID) and exposes its functionality via interfaces that are also identified by UUIDs. All COM objects support the IUnknown interface methods of AddRef(), Release(), and QueryInterface(). COM Automation objects additionally implement the IDispatch interface, which includes the Invoke() method to trigger a named function in the object.
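The division of labor between these interfaces can be illustrated with a toy model. This is plain Python, not COM: the class and its method names are invented and merely mirror the roles of AddRef(), Release(), and Invoke(), and nothing here touches the Windows API. Note also that the real Invoke() identifies its target by a numeric dispatch ID that the caller first looks up from the member name, whereas this sketch dispatches on the name directly for brevity.

```python
class ToyAutomationObject(object):
    """Toy model of reference counting plus late-bound dispatch (not COM)."""

    def __init__(self):
        self._ref_count = 1          # the creator holds the first reference
        self.destroyed = False
        self._members = {"Add": self._add, "Greet": self._greet}

    def add_ref(self):
        self._ref_count += 1
        return self._ref_count

    def release(self):
        self._ref_count -= 1
        if self._ref_count == 0:
            # a real object would free its resources at this point
            self.destroyed = True
        return self._ref_count

    def invoke(self, name, *args):
        # the member is resolved at run time, not at compile time
        try:
            member = self._members[name]
        except KeyError:
            raise AttributeError("unknown member: " + name)
        return member(*args)

    def _add(self, a, b):
        return a + b

    def _greet(self, who):
        return "Hello, " + who


obj = ToyAutomationObject()
obj.add_ref()                        # a second client takes a reference
total = obj.invoke("Add", 2, 3)
greeting = obj.invoke("Greet", "world")
obj.release()
remaining = obj.release()            # last reference released
```

Run-time lookup of this kind is what allows a scripting language to drive an object without having seen its declarations at compile time.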

The object model for the interface being exposed is described using an interface description language (IDL). IDL is a language-neutral description of a software component’s interface, normally stored in a file with an .idl extension. This IDL description can then be translated into various forms using the MIDL.EXE compiler on Windows. The generated files include the proxy DLL code for the COM object and a type library that describes the object model. The following sample shows an example of the Microsoft IDL syntax:

 // Example.idl
 import "mydefs.h", "unknown.idl";

 [
     object, uuid(d1420a03-d0ec-11b1-c04f-008c3ac31d2f),
 ]
 interface ISomething : IUnknown
 {
     HRESULT MethodA([in] short Param1, [out] BKFST *pParam2);
     HRESULT MethodB([in, out] BKFST *pParam1);
 };

 [
     object, uuid(1e1423d1-ba0c-d110-043a-00cf8cc31d2f),
     pointer_default(unique)
 ]
 interface ISomethingElse : IUnknown
 {
     HRESULT MethodC([in] long Max,
                     [in, max_is(Max)] BKFST Param1[],
                     [out] long *pSize,
                     [out, size_is(, *pSize)] BKFST **ppParam2);
 };

There is also a framework called Cross-Platform Component Object Model, or XPCOM. This is an open source project developed by Mozilla and used in a number of their applications, including the Firefox browser. XPCOM follows a very similar design to COM, although their components are not compatible or interchangeable.

11.3 Adding Python Bindings With Boost Python

The rest of this chapter is dedicated to giving you a concrete understanding of how to create script bindings for your C++ API. I begin by showing you how to create Python bindings using the Boost Python libraries.

Python is an open source, dynamically typed language that was designed by Guido van Rossum and first appeared in 1991. It is strongly typed and features automatic memory management and reference counting. It comes with an extensive standard library, including modules such as os, sys, re, difflib, codecs, datetime, math, gzip, csv, socket, json, and xml, among many others. One of the more unusual aspects of Python is the fact that indentation is used to define scope, as opposed to curly braces in C and C++. The original CPython implementation of Python is the most common, but there are other major implementations, such as Jython (written in Java) and IronPython (targeting the .NET framework). For more details on the Python language, refer to http://www.python.org/.

As already noted, Boost Python is used to define Python bindings programmatically, which can then be compiled to a dynamic library that can be loaded directly by Python. Figure 11.1 illustrates this basic workflow.

Figure 11.1 The workflow for creating Python bindings of a C++ API using Boost Python. White boxes represent files; shaded boxes represent commands.

11.3.1 Building Boost Python

Many Boost packages are implemented solely as headers, using templates and inline functions, so you only need to make sure that you add the Boost directory to your compiler’s include search path. However, using Boost Python requires that you build and link against the boost_python library, so you need to know how to build Boost.

The recommended way to build Boost libraries is to use the bjam utility, a descendant of the Perforce Jam build system. So first you will need to download bjam. Prebuilt executables are available for most platforms from http://www.boost.org/.

Building the boost libraries on a UNIX variant, such as Linux or Mac, involves the following steps:

 % cd <boost-root-directory>
 % ./bootstrap.sh --prefix=<install-dir>
 % ./bjam toolset=<toolset> install

The <toolset> string is used to define the compiler that you wish to build under, for example, “gcc,” “darwin,” “msvc,” or “intel.”

If you have multiple versions of Python installed on your machine, you can specify which version to use via bjam’s configuration file, a file called user-config.bjam that you should create in your home directory. You can find out more details about configuring bjam in the Boost.Build manual, but essentially you will want to add something like the following entries to your user-config.bjam file:

 using python
     : 2.6                            # version
     : /usr/local/bin/python2.6       # executable path
     : /usr/local/include/python2.6   # include path
     : /usr/local/lib/python2.6       # lib path
     ;
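If you are unsure which values to enter for a particular interpreter, you can ask that interpreter directly. The following is a minimal sketch that assumes Python 2.7 or 3.2 and later, where the standard sysconfig module is available. The helper name python_build_settings() is my own, and the LIBDIR configuration variable is generally only defined on UNIX-like platforms, so treat its value as a hint rather than a guarantee.

```python
import sys
import sysconfig


def python_build_settings():
    """Collect the values that the 'using python' entry asks for."""
    return {
        "version": "%d.%d" % sys.version_info[:2],
        "executable": sys.executable,
        "include": sysconfig.get_paths()["include"],
        # LIBDIR may be None, for example on Windows
        "libdir": sysconfig.get_config_var("LIBDIR"),
    }


settings = python_build_settings()
for key in ("version", "executable", "include", "libdir"):
    print("%-10s %s" % (key, settings[key]))
```

Run the script with the same interpreter that you intend to build against, for example /usr/local/bin/python2.7, so that the reported paths match that installation.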

On Windows, you can perform similar steps from the command prompt (just run bootstrap.bat instead of bootstrap.sh). However, you can also simply download prebuilt boost libraries from http://www.boostpro.com/. If you use the prebuilt libraries, you will need to make sure that you compile your script bindings using the same version of Python used to compile the BoostPro libraries.

11.3.2 Wrapping a C++ API with Boost Python

Let’s start by presenting a simple C++ API, which I will then expose to Python. I’ll use the example of a phone book that lets you store phone numbers for multiple contacts. This gives us a manageable, yet nontrivial, example to build upon throughout the chapter. Here’s the public definition of our phonebook API.

 // phonebook.h
 #include <string>

 class Person
 {
 public:
     Person();
     explicit Person(const std::string &name);

     void SetName(const std::string &name);
     std::string GetName() const;

     void SetHomeNumber(const std::string &number);
     std::string GetHomeNumber() const;
 };

 class PhoneBook
 {
 public:
     int GetSize() const;
     void AddPerson(Person *p);
     void RemovePerson(const std::string &name);
     Person *FindPerson(const std::string &name);
 };

Note that this will let us demonstrate a number of capabilities, such as wrapping multiple classes, the use of STL containers, and multiple constructors. I will also take the opportunity to demonstrate the creation of Python properties in addition to the direct mapping of C++ member functions to Python methods.

The Person class is essentially just a data container: it only contains getter/setter methods that access underlying data members. These are good candidates for translating to Python properties. A property in Python looks like a normal attribute to the caller but uses getter/setter methods to manage access to the underlying value (as well as an optional deleter method that is called when the attribute is deleted). This makes for more intuitive access to class members that you want to behave like a simple data member while also letting you provide logic that controls getting and setting the value.
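As a reminder of how properties work in Python itself, independent of any binding technology, here is a short pure-Python sketch. The Contact class, its initial property, and the type check in the setter are invented for illustration and are not part of the phonebook API.

```python
class Contact(object):
    def __init__(self, name=""):
        self._name = name

    def get_name(self):
        return self._name

    def set_name(self, value):
        # the setter is a natural place for validation logic
        if not isinstance(value, str):
            raise TypeError("name must be a string")
        self._name = value

    def del_name(self):
        self._name = ""

    # attribute syntax on the outside, method calls on the inside
    name = property(get_name, set_name, del_name)

    # supplying only a getter yields a read-only property
    initial = property(lambda self: self._name[:1])


c = Contact()
c.name = "Martin"        # calls set_name()
first = c.initial        # calls the getter

try:
    c.initial = "X"      # no setter was supplied
except AttributeError:
    read_only = True
else:
    read_only = False

del c.name               # calls del_name()
```

Callers never see the getter, setter, or deleter directly. They read, assign, and delete c.name as though it were plain data.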

Now that I have presented our C++ API, let’s look at how you can specify Python bindings for it using boost::python. You will normally create a separate .cpp file to specify the bindings for a given module, where it’s conventional to use the same base filename as the module with a _wrap suffix appended, that is, I will use phonebook_wrap.cpp for our example. This wrap file is where you specify the classes that you want to expose and the methods that you want to be available on those classes. The following file presents the boost::python code necessary to wrap our phonebook.h API.

 // phonebook_wrap.cpp
 #include "phonebook.h"
 #include <boost/python.hpp>

 using namespace boost::python;

 BOOST_PYTHON_MODULE(phonebook)
 {
     class_<Person>("Person")
         .add_property("name", &Person::GetName, &Person::SetName)
         .add_property("home_number", &Person::GetHomeNumber,
                       &Person::SetHomeNumber)
     ;

     class_<PhoneBook>("PhoneBook")
         .def("size", &PhoneBook::GetSize)
         .def("add_person", &PhoneBook::AddPerson)
         .def("remove_person", &PhoneBook::RemovePerson)
         .def("find_person", &PhoneBook::FindPerson,
              return_value_policy<reference_existing_object>())
     ;
 }

Note that for the Person class, I defined two properties, called name and home_number, and I provided the C++ getter/setter functions to control access to those properties (if I only provided a getter method then the property would be read only). For the PhoneBook class, I defined standard methods, called size(), add_person(), remove_person(), and find_person(), respectively. I also had to specify explicitly how I want the pointer return value of find_person() to behave.

You can then compile the code for phonebook.cpp and phonebook_wrap.cpp to a dynamic library. This will involve compiling against the headers for Python and boost::python as well as linking against the libraries for both. The result should be a phonebook.so library on Mac and Linux or phonebook.pyd on Windows. (Note: Python doesn’t recognize the .dylib extension on the Mac, and since Python 2.5 it no longer imports extension modules that use the .dll extension on Windows.) For example, on Linux:

 g++ -fPIC -c phonebook.cpp
 g++ -fPIC -c phonebook_wrap.cpp -I<boost_includes> -I<python_include>
 g++ -shared -o phonebook.so phonebook.o phonebook_wrap.o -lboost_python -lpython

At this point, you can directly load the dynamic library into Python using its import keyword. Here is some sample Python code that loads our C++ library and demonstrates the creation of Person and PhoneBook objects. Note the property syntax for accessing the Person object, for example, p.name, versus the method call syntax for the PhoneBook members, for example, book.add_person().

 #!/usr/bin/python
 import phonebook

 # create the phonebook
 book = phonebook.PhoneBook()

 # add one contact
 p = phonebook.Person()
 p.name = 'Martin'
 p.home_number = '(123) 456-7890'
 book.add_person(p)

 # add another contact
 p = phonebook.Person()
 p.name = 'Genevieve'
 p.home_number = '(123) 456-7890'
 book.add_person(p)

 # display number of contacts added (2)
 print "No. of contacts =", book.size()

11.3.4 Extending the Python API

It’s also possible to add new methods to the Python API that don’t exist in the C++ API. This is used most commonly to define some of the standard Python object methods, such as __str__() to return a human-readable version of the object or __eq__() to test for equality.

In the following example, I have updated the phonebook_wrap.cpp file to include a static free function that returns a string containing the values of a Person object. I then use this function to define the Person.__str__() method in Python.

 // phonebook_wrap.cpp
 #include "phonebook.h"
 #include <boost/python.hpp>
 #include <sstream>
 #include <iostream>

 using namespace boost::python;

 static std::string PrintPerson(const Person &p)
 {
     std::ostringstream stream;
     stream << p.GetName() << ": " << p.GetHomeNumber();
     return stream.str();
 }

 BOOST_PYTHON_MODULE(phonebook)
 {
     class_<Person>("Person", init<>())
         .def(init<std::string>())
         .add_property("name", &Person::GetName, &Person::SetName)
         .add_property("home_number", &Person::GetHomeNumber,
                       &Person::SetHomeNumber)
         .def("__str__", &PrintPerson)
     ;
     ….
 }

This demonstrates the general ability to add new methods to a class. However, in this particular case, Boost Python provides an alternative way to specify the __str__() function in a more idiomatic fashion. You could define operator<< for Person and tell Boost to use this operator for the __str__() method. For example,

 #include "phonebook.h"
 #include <boost/python.hpp>
 #include <iostream>

 using namespace boost::python;

 std::ostream &operator<<(std::ostream &os, const Person &p)
 {
     os << p.GetName() << ": " << p.GetHomeNumber();
     return os;
 }

 BOOST_PYTHON_MODULE(phonebook)
 {
     class_<Person>("Person", init<>())
         .def(init<std::string>())
         .add_property("name", &Person::GetName, &Person::SetName)
         .add_property("home_number", &Person::GetHomeNumber,
                       &Person::SetHomeNumber)
         .def(self_ns::str(self))
     ;
 }

With this definition for Person.__str__() you can now write code like the following (entered at the interactive Python interpreter prompt, >>>):

 >>> import phonebook
 >>> p = phonebook.Person('Martin')
 >>> print p
 Martin:
 >>> p.home_number = '(123) 456-7890'
 >>> print p
 Martin: (123) 456-7890

While I am talking about extending the Python API, I will note that the dynamic nature of Python means that you can actually add new methods to a class at run time. This is not a Boost Python feature, but a core capability of the Python language itself. For example, you could define the __str__() method at the Python level, as follows:

 #!/usr/bin/python
 import phonebook

 def person_str(self):
     return "Name: %s\nHome: %s" % (self.name, self.home_number)

 # override the __str__ method for the Person class
 phonebook.Person.__str__ = person_str

 p = phonebook.Person()
 p.name = 'Martin'
 p.home_number = '(123) 456-7890'
 print p

This will output the following text to the shell.

 Name: Martin

 Home: (123) 456-7890
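The same technique works for the other special methods, such as __eq__(). The sketch below is self-contained: Person here is a hypothetical pure-Python stand-in for the bound class, used so that the example runs without the compiled module, and the choice to compare by name and home number is an assumption made for illustration.

```python
class Person(object):
    """Pure-Python stand-in for phonebook.Person (illustration only)."""

    def __init__(self, name=""):
        self.name = name
        self.home_number = ""


def person_eq(self, other):
    if not isinstance(other, Person):
        return NotImplemented
    return (self.name, self.home_number) == (other.name, other.home_number)


a = Person("Martin")
b = Person("Martin")
before = (a == b)        # default comparison is by object identity

# attach the method to the existing class at run time
Person.__eq__ = person_eq
after = (a == b)         # existing instances pick up the new behavior
```

Because the method is looked up on the class at call time, objects created before the assignment are affected as well as those created afterward.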

11.3.5 Inheritance in C++

Both C++ and Python support multiple inheritance, and Boost Python makes it easy to expose all of the base classes of any C++ class. I’ll show how this is done by turning the Person class into a base class (i.e., giving it a virtual destructor) and adding a derived class called PersonWithCell, which adds the ability to specify a cell phone number. This is not a particularly good design choice, but it serves our purposes for this example.

 // phonebook.h
 #include <string>

 class Person
 {
 public:
     Person();
     explicit Person(const std::string &name);
     virtual ~Person();

     void SetName(const std::string &name);
     std::string GetName() const;

     void SetHomeNumber(const std::string &number);
     std::string GetHomeNumber() const;
 };

 class PersonWithCell : public Person
 {
 public:
     PersonWithCell();
     explicit PersonWithCell(const std::string &name);

     void SetCellNumber(const std::string &number);
     std::string GetCellNumber() const;
 };

You can then represent this inheritance hierarchy in Python by updating the wrap file as follows:

 BOOST_PYTHON_MODULE(phonebook)

{

 class_<Person>("Person", init<>())

 .def(init<std::string>())

 .add_property("name", &Person::GetName, &Person::SetName)

 .add_property("home_number", &Person::GetHomeNumber,

 &Person::SetHomeNumber)

 ;

 class_<PersonWithCell, bases<Person> >("PersonWithCell")

 .def(init<std::string>())

 .add_property("cell_number", &PersonWithCell::GetCellNumber,

 &PersonWithCell::SetCellNumber)

 ;

 …

 }

Now you can create PersonWithCell objects from Python as follows:

 #!/usr/bin/python

 import phonebook

 book = phonebook.PhoneBook()

 p = phonebook.Person()

 p.name = 'Martin'

 p.home_number = '(123) 456-7890'

 book.add_person(p)

 p = phonebook.PersonWithCell()

 p.name = 'Genevieve'

 p.home_number = '(123) 456-7890'

 p.cell_number = '(123) 097-2134'

 book.add_person(p)
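
The reason both objects can be handed to add_person() is that the bases<Person> declaration tells Boost Python about the C++ inheritance relationship, so a PersonWithCell is accepted wherever a Person is expected. The following sketch models that behavior with pure-Python stand-in classes; the class bodies and the explicit isinstance() check are assumptions made for illustration and are not the code that Boost Python generates.

```python
# Pure-Python stand-ins for the wrapped C++ classes (illustrative only).
class Person:
    def __init__(self, name=""):
        self.name = name
        self.home_number = ""


class PersonWithCell(Person):
    def __init__(self, name=""):
        super().__init__(name)
        self.cell_number = ""


class PhoneBook:
    def __init__(self):
        self._people = []

    def add_person(self, person):
        # Models the argument conversion: anything that is a Person,
        # including a derived class, is accepted.
        if not isinstance(person, Person):
            raise TypeError("expected a Person")
        self._people.append(person)

    def size(self):
        return len(self._people)


book = PhoneBook()
book.add_person(Person("Martin"))
book.add_person(PersonWithCell("Genevieve"))
print(book.size())
```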

11.3.6 Cross-Language Polymorphism

You can create classes in Python that derive from C++ classes that you’ve exposed with Boost Python. For example, the following Python program shows how you could create the PersonWithCell class directly in Python and still be able to add instances of this class to PhoneBook.

 #!/usr/bin/python

 import phonebook

 book = phonebook.PhoneBook()

 class PyPersonWithCell(phonebook.Person):

     def get_cell_number(self):

         return self.cell

     def set_cell_number(self, n):

         self.cell = n

     cell_number = property(get_cell_number, set_cell_number)

 p = PyPersonWithCell()

 p.name = 'Martin'

 p.home_number = '(123) 456-7890'

 p.cell_number = '(123) 097-2134'

 book.add_person(p)

Of course, the cell_number property on PyPersonWithCell will only be accessible from Python. C++ will have no idea that a new property has been added dynamically in a derived class.

It’s also important to note that even C++ virtual functions that are overridden in Python will not be callable from C++ by default. However, Boost Python does provide a way to do this if cross-language polymorphism is important for your API. This is done by defining a wrapper class that multiply inherits from the C++ class being bound as well as Boost Python’s wrapper class template. This wrapper class can then check to see if an override has been defined in Python for a given virtual function and then call that method if it is defined. For example, given a C++ class called Base with a virtual method, you can create the wrapper class as follows:

 class Base

 {

public:

 virtual ~Base();

 virtual int f();

 };

 class BaseWrap : public Base, public wrapper<Base>

 {

public:

 int f()

{

 // check for an override in Python

 if (override f = this->get_override("f"))

 return f();

 // or call the C++ implementation

 return Base::f();

 }

 int default_f()

{

 return this->Base::f();

 }

 };

Then you can expose the Base class as follows:

 class_<BaseWrap, boost::noncopyable>("Base")

 .def("f", &Base::f, &BaseWrap::default_f)

 ;
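
To see what the wrapper class is checking for, it can help to model the dispatch in plain Python. The following sketch is only an analogy: Base is a stand-in class, call_f() is a hypothetical helper that plays the role of BaseWrap::f(), and the return values 0 and 42 are arbitrary. It is not how Boost Python implements get_override().

```python
class Base:
    """Stand-in for the wrapped C++ class."""

    def f(self):
        # Plays the role of the C++ implementation, Base::f().
        return 0


def call_f(obj):
    """Model of BaseWrap::f(): prefer an override, else use the default."""
    method = getattr(type(obj), "f")
    if method is not Base.f:
        # A subclass has supplied its own f(), so call it.
        return method(obj)
    # No override was found; fall back to the base implementation.
    return Base.f(obj)


class Derived(Base):
    def f(self):
        return 42


class Unchanged(Base):
    pass


results = (call_f(Base()), call_f(Derived()), call_f(Unchanged()))
print(results)
```

With the real bindings, the Python side looks just like Derived here: you subclass the exposed Base class and define f(), and C++ code that calls f() through a Base pointer reaches the Python method.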

11.3.7 Supporting Iterators

Boost Python also lets you create Python iterators based on STL iterator interfaces that you define in your C++ API. This lets you create objects in Python that behave more “Pythonically” in terms of iterating through the elements in a container. For example, you can add begin() and end() methods to the PhoneBook class that provide access to STL iterators for traversing through all of the contacts in the phone book.

 class PhoneBook

 {

public:

 int GetSize() const;

 void AddPerson(Person *p);

 void RemovePerson(const std::string &name);

 Person *FindPerson(const std::string &name);

 typedef std::vector<Person *> PersonList;

 PersonList::iterator begin();

 PersonList::iterator end();

 };

With these additional methods, you can extend the wrapping for the PhoneBook class to specify the __iter__() method, which is the Python way for an object to return an iterator.

 BOOST_PYTHON_MODULE(phonebook)

{

 …

 class_<PhoneBook>("PhoneBook")

 .def("size", &PhoneBook::GetSize)

 .def("add_person", &PhoneBook::AddPerson)

 .def("remove_person", &PhoneBook::RemovePerson)

 .def("find_person", &PhoneBook::FindPerson,

 return_value_policy<reference_existing_object>())

 .def("__iter__", range(&PhoneBook::begin, &PhoneBook::end))

 ;

 }

Now, you can write Python code that iterates through all of the contacts in a PhoneBook object as follows:

 #!/usr/bin/python

 import phonebook

 book = phonebook.PhoneBook()

 book.add_person(phonebook.Person())

 book.add_person(phonebook.Person())

 for person in book:

     print(person)
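
For readers less familiar with Python's iterator protocol, the following sketch shows what defining __iter__() achieves, using a pure-Python stand-in for PhoneBook. The internal _people list and the use of plain strings as contacts are assumptions made to keep the example self-contained; in the real bindings the iterator is built from the C++ begin() and end() methods.

```python
class PhoneBook:
    """Pure-Python stand-in for the wrapped C++ PhoneBook."""

    def __init__(self):
        self._people = []

    def add_person(self, person):
        self._people.append(person)

    def __iter__(self):
        # Return a fresh iterator each time, so the book can be
        # traversed more than once.
        return iter(self._people)


book = PhoneBook()
book.add_person("Martin")
book.add_person("Genevieve")

# Any object with __iter__() works in a for loop or a comprehension.
names = [person for person in book]
print(names)
```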

11.3.8 Putting It All Together

Combining all of the features that I’ve introduced in the preceding sections, here is the final definition of the phonebook.h header and the phonebook_wrap.cpp Boost Python wrapper.

 // phonebook.h

 #include <string>

 #include <vector>

 class Person

 {

public:

 Person();

 explicit Person(const std::string &name);

 virtual ~Person();

 void SetName(const std::string &name);

 std::string GetName() const;

 void SetHomeNumber(const std::string &number);

 std::string GetHomeNumber() const;

 };

 class PersonWithCell : public Person

 {

public:

 PersonWithCell();

 explicit PersonWithCell(const std::string &name);

 void SetCellNumber(const std::string &number);

 std::string GetCellNumber() const;

 };

 class PhoneBook

 {

public:

 int GetSize() const;

 void AddPerson(Person *p);

 void RemovePerson(const std::string &name);

 Person *FindPerson(const std::string &name);

 typedef std::vector<Person *> PersonList;

 PersonList::iterator begin() { return mList.begin(); }

 PersonList::iterator end() { return mList.end(); }

private:

 // storage for the contacts that begin() and end() iterate over

 PersonList mList;

 };

 // phonebook_wrap.cpp

 #include "phonebook.h"

 #include <boost/python.hpp>

 #include <sstream>

 #include <iostream>

 using namespace boost::python;

 std::ostream &operator<<(std::ostream &os, const Person &p)

{

 os << p.GetName() << ": " << p.GetHomeNumber();

 return os;

 }

 static std::string PrintPersonWithCell(const PersonWithCell *p)

{

 std::ostringstream stream;

 stream << "Name: " << p->GetName() << ", Home: ";

 stream << p->GetHomeNumber() << ", Cell: ";

 stream << p->GetCellNumber();

 return stream.str();

 }

 BOOST_PYTHON_MODULE(phonebook)

{

 class_<Person>("Person", init<>())

 .def(init<std::string>())

 .add_property("name", &Person::GetName, &Person::SetName)

 .add_property("home_number", &Person::GetHomeNumber,

 &Person::SetHomeNumber)

 .def(self_ns::str(self))

 ;

 class_<PersonWithCell,

 bases<Person> >("PersonWithCell")

 .def(init<std::string>())

 .add_property("cell_number", &PersonWithCell::GetCellNumber,

 &PersonWithCell::SetCellNumber)

 .def("__str__", &PrintPersonWithCell)

 ;

 class_<PhoneBook>("PhoneBook")

 .def("size", &PhoneBook::GetSize)

 .def("add_person", &PhoneBook::AddPerson)

 .def("remove_person", &PhoneBook::RemovePerson)

 .def("find_person", &PhoneBook::FindPerson,

 return_value_policy<reference_existing_object>())

 .def("__iter__", range(&PhoneBook::begin, &PhoneBook::end))

 ;

 }

11.4 Adding Ruby Bindings with SWIG

The following sections look at another example of creating script bindings for C++ APIs. In this case, I will use the Simplified Wrapper and Interface Generator (SWIG) to create bindings for the Ruby language.

Ruby is an open source dynamically typed scripting language that was released by Yukihiro “Matz” Matsumoto in 1995. Ruby was influenced by languages such as Perl and Smalltalk with an emphasis on ease of use. In Ruby, everything is an object, even types that C++ treats separately as built-in primitives such as int, float, and bool. Ruby is an extremely popular scripting language and is often cited as being more popular than Python in Japan, where it was originally developed. For more information on the Ruby language, see http://www.ruby-lang.org/.

SWIG works by reading the binding definition within an interface file and generating C++ code to specify the bindings. This generated code can then be compiled to a dynamic library that can be loaded directly by Ruby. Figure 11.2 illustrates this basic workflow. Note that SWIG supports many scripting languages. I will use it to create Ruby bindings, but it could just as easily be used to create Python bindings, Perl bindings, or bindings for several other languages.

Figure 11.2 The workflow for creating Ruby bindings of a C++ API using SWIG. White boxes represent files; shaded boxes represent commands.

11.4.1 Wrapping a C++ API with SWIG

I’ll start with the same phone book API from the Python example and then show how to create Ruby bindings for this interface using SWIG. The phone book C++ header looks like

 // phonebook.h

 #include <string>

 class Person

 {

public:

 Person();

 explicit Person(const std::string &name);

 void SetName(const std::string &name);

 std::string GetName() const;

 void SetHomeNumber(const std::string &number);

 std::string GetHomeNumber() const;

 };

 class PhoneBook

 {

public:

 bool IsEmpty() const;

 int GetSize() const;

 void AddPerson(Person *p);

 void RemovePerson(const std::string &name);

 Person *FindPerson(const std::string &name);

 };

Let’s take a look at a basic SWIG interface file to specify how you want to expose this C++ API to Ruby.

 // phonebook.i

 %module phonebook

 %{

 // we need the API header to compile the bindings

 #include "phonebook.h"

 %}

 // pull in the built-in SWIG STL wrappings (note the '%')

 %include "std_string.i"

 %include "std_vector.i"

 class Person

 {

public:

 Person();

 explicit Person(const std::string &name);

 void SetName(const std::string &name);

 std::string GetName() const;

 void SetHomeNumber(const std::string &number);

 std::string GetHomeNumber() const;

 };

 class PhoneBook

 {

public:

 bool IsEmpty() const;

 int GetSize() const;

 void AddPerson(Person *p);

 void RemovePerson(const std::string &name);

 Person *FindPerson(const std::string &name);

 };

You can see that the interface file looks very similar to the phonebook.h header file. In fact, SWIG can parse most C++ syntax directly. If your C++ header is very simple, you can even use SWIG’s %include directive to simply tell it to read the C++ header file directly. I’ve chosen not to do this so that you have direct control over what you do and do not expose to Ruby.

Now that you have an initial interface file, you can ask SWIG to read this file and generate Ruby bindings for all the specified C++ classes and methods. This will create a phonebook_wrap.cxx file, which you can compile together with the C++ code to create a dynamic library. For example, the steps on Linux are

 swig -c++ -ruby phonebook.i # creates phonebook_wrap.cxx

 g++ -fPIC -c phonebook_wrap.cxx -I<ruby-include-path>

 g++ -fPIC -c phonebook.cpp

 g++ -shared -o phonebook.so phonebook_wrap.o phonebook.o -L<ruby-lib-path> -lruby

11.4.2 Tuning the Ruby API

This first attempt at a Ruby binding is rather rudimentary. There are several issues that you will want to address to make the API feel more natural to Ruby programmers. First off, the naming convention for Ruby methods is to use snake case instead of camel case, that is, add_person() instead of AddPerson(). SWIG supports this by letting you rename symbols in the scripting API using its %rename command. For example, you can add the following lines to the interface file to tell SWIG to rename the methods of the PhoneBook class.

 %rename("size") PhoneBook::GetSize;

 %rename("add_person") PhoneBook::AddPerson;

 %rename("remove_person") PhoneBook::RemovePerson;

 %rename("find_person") PhoneBook::FindPerson;

Recent versions of SWIG actually support an -autorename command line option to perform this function renaming automatically. It is expected that this option will eventually be turned on by default.

Second, Ruby has a concept similar to Python’s properties to provide convenient access to data members. In fact, rather elegantly, all instance variables in Ruby are private and must therefore be accessed via getter/setter methods. The %rename syntax can be used to provide this capability too. For example,

 %rename("name") Person::GetName;

 %rename("name=") Person::SetName;

 %rename("home_number") Person::GetHomeNumber;

 %rename("home_number=") Person::SetHomeNumber;

Finally, you may have noticed that I added an extra IsEmpty() method to the PhoneBook C++ class. This method simply returns true if no contacts have been added to the phone book. I’ve added this because it lets me demonstrate how to expose a C++ member function as a Ruby query method. This is a method that returns a Boolean value and, by convention, has a name that ends with a question mark. I would therefore like the IsEmpty() C++ function to appear as empty? in Ruby. This can be done using either SWIG’s %predicate or %rename directives.

 %rename("empty?") PhoneBook::IsEmpty;
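
The three naming conventions used so far (snake case, name and name= accessors, and a trailing question mark for query methods) are regular enough that the %rename lines could be generated by a small script. The following Python sketch is one hypothetical way to do that. The helper names snake, ruby_name, and rename_directive are my own and are not part of SWIG, and the prefix rules are a simplification for illustration, not a description of what -autorename does.

```python
import re


def snake(name):
    # Insert "_" between a lowercase letter or digit and the capital
    # that follows it, then lowercase the result.
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower()


def has_prefix(name, prefix):
    # True only if the prefix is followed by a capital letter, so that
    # "IsEmpty" matches "Is" but "Issue" does not.
    return name.startswith(prefix) and name[len(prefix):len(prefix) + 1].isupper()


def ruby_name(cpp_method):
    """Map a C++ method name to a Ruby-style name (illustrative rule only)."""
    if has_prefix(cpp_method, "Get"):
        return snake(cpp_method[3:])        # getter: drop the prefix
    if has_prefix(cpp_method, "Set"):
        return snake(cpp_method[3:]) + "="  # setter: append "="
    if has_prefix(cpp_method, "Is"):
        return snake(cpp_method[2:]) + "?"  # query method: append "?"
    return snake(cpp_method)                # anything else: snake case


def rename_directive(cls, cpp_method):
    return '%rename("{0}") {1}::{2};'.format(ruby_name(cpp_method), cls, cpp_method)


directives = [
    rename_directive("Person", "GetHomeNumber"),
    rename_directive("Person", "SetHomeNumber"),
    rename_directive("PhoneBook", "IsEmpty"),
    rename_directive("PhoneBook", "AddPerson"),
]
print("\n".join(directives))
```

Writing the directives by hand, as I do in this chapter, keeps full control over each name; a generator like this only pays off for larger APIs with consistent naming.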

With these amendments to our interface file, our Ruby API is starting to feel more native. If you rerun SWIG on the interface file and rebuild the phonebook dynamic library, you can import it directly into Ruby and write code such as

 #!/usr/bin/ruby

 require 'phonebook'

 book = Phonebook::PhoneBook.new

 p = Phonebook::Person.new

 p.name = 'Martin'

 p.home_number = '(123) 456-7890'

 book.add_person(p)

 p = Phonebook::Person.new

 p.name = 'Genevieve'

 p.home_number = '(123) 456-7890'

 book.add_person(p)

 puts "No. of contacts = #{book.size}"

Note the use of the p.name getter and p.name= setter, as well as the snake case add_person() method name.

11.4.5 Inheritance in C++

As with the constructor case just given, there’s nothing special that you have to do to represent inheritance using SWIG. You simply declare the class in the interface file using the standard C++ syntax. For example, you can add the following PersonWithCell class to our API:

 // phonebook.h

 #include <string>

 class Person

 {

public:

 Person();

 explicit Person(const std::string &name);

 virtual ~Person();

 void SetName(const std::string &name);

 std::string GetName() const;

 void SetHomeNumber(const std::string &number);

 std::string GetHomeNumber() const;

 };

 class PersonWithCell : public Person

 {

public:

 PersonWithCell();

 explicit PersonWithCell(const std::string &name);

 void SetCellNumber(const std::string &number);

 std::string GetCellNumber() const;

 };

 …

Then you can update the SWIG interface file as follows:

 // phonebook.i

 …

 %rename("name") Person::GetName;

 %rename("name=") Person::SetName;

 %rename("home_number") Person::GetHomeNumber;

 %rename("home_number=") Person::SetHomeNumber;

 class Person

 {

public:

 Person();

 explicit Person(const std::string &name);

 virtual ~Person();

 void SetName(const std::string &name);

 std::string GetName() const;

 void SetHomeNumber(const std::string &number);

 std::string GetHomeNumber() const;

 …

 };

 %rename("cell_number") PersonWithCell::GetCellNumber;

 %rename("cell_number=") PersonWithCell::SetCellNumber;

 class PersonWithCell : public Person

 {

public:

 PersonWithCell();

 explicit PersonWithCell(const std::string &name);

 void SetCellNumber(const std::string &number);

 std::string GetCellNumber() const;

 };

 …

You can then access this derived C++ class from Ruby as follows:

 #!/usr/bin/ruby

 require 'phonebook'

 p = Phonebook::Person.new

 p.name = 'Martin'

 p.home_number = '(123) 456-7890'

 p = Phonebook::PersonWithCell.new

 p.name = 'Genevieve'

 p.home_number = '(123) 456-7890'

 p.cell_number = '(123) 097-2134'

Ruby supports only single inheritance, with support for additional mixin classes. C++ of course supports multiple inheritance. Therefore, by default, SWIG will only consider the first base class listed in the derived class: member functions in any other base classes will not be inherited. However, recent versions of SWIG support an optional -minherit command line option that will attempt to simulate multiple inheritance using Ruby mixins (although in this case a class no longer has a true base class in Ruby).

11.4.7 Putting It All Together

I have evolved our simple example through several iterations in order to add each incremental enhancement. So I will finish off this section by presenting the entire C++ header and SWIG interface file for your reference. First of all, here is the C++ API:

 // phonebook.h

 #include <string>

 class Person

 {

public:

 Person();

 explicit Person(const std::string &name);

 virtual ~Person();

 void SetName(const std::string &name);

 std::string GetName() const;

 void SetHomeNumber(const std::string &number);

 std::string GetHomeNumber() const;

 };

 class PersonWithCell : public Person

 {

public:

 PersonWithCell();

 explicit PersonWithCell(const std::string &name);

 void SetCellNumber(const std::string &number);

 std::string GetCellNumber() const;

 };

 class PhoneBook

 {

public:

 bool IsEmpty() const;

 int GetSize() const;

 void AddPerson(Person *p);

 void RemovePerson(const std::string &name);

 Person *FindPerson(const std::string &name);

 };

and here is the final SWIG interface (.i) file:

 %module(directors="1") phonebook

 %{

 #include "phonebook.h"

 #include <sstream>

 #include <iostream>

 %}

 %feature("director");

 %include "std_string.i"

 %include "std_vector.i"

 %rename("name") Person::GetName;

 %rename("name=") Person::SetName;

 %rename("home_number") Person::GetHomeNumber;

 %rename("home_number=") Person::SetHomeNumber;

 class Person

 {

public:

 Person();

 explicit Person(const std::string &name);

 virtual ~Person();

 void SetName(const std::string &name);

 std::string GetName() const;

 void SetHomeNumber(const std::string &number);

 std::string GetHomeNumber() const;

 %extend {

 std::string to_s() {

 std::ostringstream stream;

 stream << self->GetName() << ": ";

 stream << self->GetHomeNumber();

 return stream.str();

 }

 }

 };

 %rename("cell_number") PersonWithCell::GetCellNumber;

 %rename("cell_number=") PersonWithCell::SetCellNumber;

 class PersonWithCell : public Person

 {

public:

 PersonWithCell();

 explicit PersonWithCell(const std::string &name);

 void SetCellNumber(const std::string &number);

 std::string GetCellNumber() const;

 };

 %rename("empty?") PhoneBook::IsEmpty;

 %rename("size") PhoneBook::GetSize;

 %rename("add_person") PhoneBook::AddPerson;

 %rename("remove_person") PhoneBook::RemovePerson;

 %rename("find_person") PhoneBook::FindPerson;

 class PhoneBook

 {

public:

 bool IsEmpty() const;

 int GetSize() const;

 void AddPerson(Person *p);

 void RemovePerson(const std::string &name);

 Person *FindPerson(const std::string &name);

 };