Scripting
Up until this chapter, I have focused on general aspects of API design that could be applicable to any C++ project. Having covered the standard API design pipeline, the remaining chapters in this book deal with the more specialized topics of scripting and extensibility. While not all APIs need to be concerned with these topics, they are becoming more popular subjects in modern application development. I therefore felt that a comprehensive book on C++ API design should include coverage of these advanced topics.
Accordingly, this chapter deals with the topic of scripting, that is, allowing your C++ API to be accessed from a scripting language, such as Python, Ruby, Lua, Tcl, or Perl. I will explain why you might want to do this and what some of the issues are that you need to be aware of, and then review some of the main technologies that let you create bindings for these languages.
To make this chapter more practical and instructive, I will take a detailed look at two different script binding technologies and show how these can be used to create bindings for two different scripting languages. Specifically, I will provide an in-depth treatment of how to create Python bindings for your C++ API using Boost Python, followed by a thorough analysis of how to create Ruby bindings using the Simplified Wrapper and Interface Generator (SWIG). I have chosen to focus on Python and Ruby because these are two of the most popular scripting languages in use today; in terms of binding technologies, Boost Python and SWIG are both freely available open source solutions that provide extensive control over the resulting bindings.
A script binding provides a way to access a C++ API from a scripting language. This normally involves creating wrapper code for the C++ classes and functions that allow them to be imported into the scripting language using its native module loading features, for example, the import keyword in Python, require in Ruby, or use in Perl.
There are two main strategies for integrating C++ with a scripting language:
1. Extending the language. In this model, a script binding is provided as a module that supplements the functionality of the scripting language. That is, users who write code with the scripting language can use your module in their own scripts. Your module will look just like any other module for that language. For example, the expat and md5 modules in the Python standard library are implemented in C, not Python.
2. Embedding within an application. In this case, an end-user C++ application embeds a scripting language inside of it. Script bindings are then used to let end users write scripts for that specific application that call down into the core functionality of the program. Examples of this include the Autodesk Maya 3D modeling system, which offers Python and Maya Embedded Language (MEL) scripting, and the Adobe Director multimedia authoring platform, which embeds the Lingo scripting language.
Whichever strategy applies to your situation, the procedure to define and build script bindings is the same in each case. The only thing that really changes is who owns the C++ main() function.
The decision to provide access to native code APIs from within a scripting language offers many advantages. These advantages can either apply directly to you, if you provide a supported script binding for your C++ API, or to your clients, who may create their own script bindings on top of your C++-only API. I enumerate a few of these benefits here.
• Cross-platform. Scripting languages are interpreted, meaning that they execute plain ASCII source code or platform-independent byte code. They will also normally provide their own modules to interface with platform-specific features, such as the file system. Writing code for a scripting language should therefore work on multiple platforms without modification. This can also be considered a disadvantage for proprietary projects because scripting code will normally have to be distributed in source form.
• Faster development. If you make a change to a C++ program, you have to compile and link your code again. For large systems, this can be a time-consuming operation and can fracture engineer productivity as they wait to be able to test their change. In a scripting language, you simply edit the source code and then run it: there is no compile and link stage. This allows you to prototype and test new changes quickly, often resulting in greater engineer efficiency and project velocity.
• Write less code. A given problem can normally be solved with less code in a higher-level scripting language versus C++. Scripting languages don’t require explicit memory management, they tend to have a much larger standard library available than C++’s STL, and they often take care of complex concepts such as reference counting behind the scenes. For example, the following single line of Ruby code will take a string and return an alphabetized list of all the unique letters in that string. This would take a lot more code to implement in C++.
"Hello World".downcase.split("").uniq.sort.join
• Script-based applications. The traditional view of scripting languages is that you use them for small command-line tasks, but you must write an end-user application in a language such as C++ for maximum efficiency. However, an alternative view is that you can write the core performance-critical routines in C++, create script bindings for them, and then write your application in a scripting language. In Model–View–Controller parlance, the Model and potentially also the View are written in C++, whereas the Controller is implemented using a scripting language. The key insight is that you don’t need a super fast compiled language to manage user input that happens at a low frequency.
At Pixar, we actually rewrote our in-house animation toolset this way: as a Python-based main application that calls down into an extensive set of very efficient Model and View C++ APIs. This gave us all the advantages listed here, such as removing the compile-link phase for many application logic changes while still delivering an interactive 3D animation system for our artists.
• Support for expert users. Adding a scripting language to an end-user application can allow advanced users to customize the functionality of the application by writing macros to perform repetitive tasks or tasks that are not exposed through the GUI. This can be done without sacrificing the usability of the software for novice users, who will interface with the application solely through the GUI.
• Extensibility. In addition to giving expert users access to under-the-covers functionality, a scripting interface can be used to let them add entirely new functionality to the application through plugin interfaces. This means that the developer of the application is no longer responsible for solving every user’s problem. Instead, users have the power to solve their own problems. For example, the Firefox Web browser allows new extensions to be created using JavaScript as its embedded scripting language.
• Scripting for testability. One extremely valuable side effect of being able to write code in a scripting language is that you can write automated tests using that language. This is an advantage because you can enable your QA team to write automated tests too rather than rely solely on black-box testing. Often (although not exclusively), QA engineers will not write C++ code. However, there are many skilled white-box QA engineers who can write scripting language code. Involving your QA team in writing automated tests can let them contribute at a lower level and provide greater testing coverage.
• Expressiveness. The field of linguistics defines the principle of linguistic relativity (also known as the Sapir–Whorf hypothesis) as the idea that people’s thoughts and behavior are influenced by their language. When applied to the field of computer science, this can mean that the expressiveness, flexibility, and ease of use of a programming language could impact the kinds of solutions that you can envision. That’s because you don’t have to be distracted by low-level issues such as memory management or statically typed data representations. This is obviously a more qualitative and subjective point than the previous technical arguments, but it is no less valid or significant.
One important issue to be aware of when exposing a C++ API in a scripting language is that the patterns and idioms of C++ will not map directly to those of the scripting language. As such, a direct translation of the C++ API into the scripting language may produce a script module that doesn’t feel natural or native in that language. For example,
• Naming conventions. C++ functions are often written using either upper or lower camel case, that is, GetName() or getName(). However, the Python convention (defined in PEP 8) is to use snake case for method names, for example, get_name(). Similarly, Ruby specifies that method names should use snake case also.
• Getters/setters. In this book, I have advocated that you should never directly expose data members in your classes. Instead, you should always provide getter/setter methods to access those members. However, many scripting languages let you use member-variable syntax while still routing the access through getter/setter methods. In fact, in Ruby, this is the only way that you can access member variables from outside of a class. The result is that instead of C++ style code such as
std::string name = object.GetName();
you can simply write name = object.name, which still goes through the underlying getter/setter methods.
• Iterators. Most scripting languages support the general concept of iterators to navigate through the elements in a sequence. However, the implementation of this concept will not normally harmonize with the STL implementation. For example, C++ has five categories of iterators (forward, bidirectional, random access, input, and output), whereas Python has a single iterator category (forward). Making a C++ object iterable in a scripting language therefore requires specific attention to adapt it to the semantics of that language, such as adding an __iter__() method in the case of Python.
• Operators. You already know that C++ supports several operators, such as operator+, operator+=, and operator[]. Often these can be translated directly into the equivalent syntax of the scripting language, such as exposing C++’s stream operator<< as the to_s() method in Ruby (which returns a string representation of the object). However, the target language may support additional operators that are not supported by C++, such as Ruby’s power operator (**) and its operator to return the quotient and modulus of a division (divmod).
• Containers. STL provides container classes such as std::vector, std::set, and std::map. These are statically typed class templates that can only contain objects of the same type. By comparison, many scripting languages are dynamically typed and support containers with elements of different types. It’s much more common to use these flexible types to pass data around in scripting languages. For example, a C++ method that accepts several non-const reference arguments might be better represented in a scripting language by a method that returns a tuple.
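As a rough illustration of the iterator and container points, the following Python sketch shows the kind of interface a hand-tuned binding might present to script authors. The Polyline class, its method names, and the C++ signature quoted in the comment are hypothetical and written purely in Python for illustration; a real binding would forward these calls to C++.

```python
class Polyline:
    """Hypothetical pure-Python stand-in for a wrapped C++ container class."""

    def __init__(self, points):
        # Each point is an (x, y) pair.
        self._points = list(points)

    def __iter__(self):
        # Python's iterator protocol takes the place of begin()/end() pairs.
        return iter(self._points)

    def get_bounds(self):
        # An assumed C++ signature such as
        #   void GetBounds(float &xmin, float &ymin, float &xmax, float &ymax);
        # reads more naturally in Python as a method that returns a tuple.
        xs = [x for x, _ in self._points]
        ys = [y for _, y in self._points]
        return (min(xs), min(ys), max(xs), max(ys))


line = Polyline([(0.0, 1.0), (2.0, -1.0), (4.0, 3.0)])

# The object can be used directly in a for loop.
for x, y in line:
    print(x, y)

# The four results are unpacked from a single return value.
xmin, ymin, xmax, ymax = line.get_bounds()
```

Script authors then never see output parameters or iterator categories; they simply loop over the object and unpack the result.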
All of this means that creating a good script binding is often a process that requires a degree of manual tuning. Technologies that attempt to create bindings fully automatically will normally produce APIs that don’t feel natural in the scripting language. For example, the PyObjC utility provides a bridge for Objective-C objects in Python, but can result in cumbersome constructs in Python, such as methods called setValue_(). In contrast, a technology that lets you manually craft the way that functions are exposed in script will let you produce a higher quality result.
The language barrier refers to the boundary where C++ meets the scripting language. Script bindings for an object will take care of forwarding method calls in the scripting language down into the relevant C++ code. However, having C++ code call up into the scripting language will not normally happen by default. This is because a C++ API that has not been specifically designed to interoperate with a scripting language will not know that it’s running in a script environment.
For example, consider a C++ class with a virtual method that gets overridden in Python. The C++ code has no idea that Python has overridden one of its virtual methods. This makes sense because the C++ vtable is created statically at compile time and cannot adapt to Python’s dynamic ability to add methods at run time. Some binding technologies provide extra functionality to make this cross-language polymorphism work. I will discuss how this is done for Boost Python and SWIG later in the chapter.
Another issue to be aware of is whether the C++ code uses an internal event or notification system. If this is the case, some extra mechanism will need to be put in place to forward any C++-triggered events across the language boundary into script code. For example, Qt and Boost offer a signal/slot system where C++ code can register to receive notifications when another C++ object changes state. However, allowing scripts to receive these events will require you to write explicit code that can intercept the C++ events and send them over the boundary to the script object.
Finally, exceptions are another case where C++ code may need to communicate with script code. For example, uncaught C++ exceptions must be caught at the language barrier and then be translated into the native exception type of the script language.
Various technologies can be used to generate the wrappers that allow a scripting language to call down into your C++ code. Each offers its own specific advantages and disadvantages. Some are language-neutral technologies that support many scripting languages (such as COM or CORBA), some are specific to C/C++ but provide support for creating bindings to many languages (such as SWIG), some provide C++ bindings for a single language (such as Boost Python), whereas others focus on C++ bindings for a specific API (such as the Pivy Python bindings for the Open Inventor C++ toolkit).
I will list several of these technologies here, and then in the remainder of the chapter I will focus on two in more detail. I have chosen to focus on portable yet C++-specific solutions rather than considering the more general and heavyweight interprocess communication models such as COM or CORBA. To provide greater utility, I will look at one binding technology that lets you define the script binding programmatically (Boost Python) and another that uses an interface definition file to generate code for the binding (SWIG).
Any script binding technology is essentially founded upon the Adapter design pattern, that is, it provides a one-to-one mapping of one API to another API while also translating data types into their most appropriate native form and perhaps using more idiomatic naming conventions. Recognizing this fact means that you should also be aware of the standard issues that face API wrapping design patterns such as Proxy and Adapter. Of principal concern is the need to keep the two APIs synchronized over time. As you will see, both Boost Python and SWIG require you to keep redundant files in sync as you evolve the C++ API, such as extra C++ files in the case of Boost and separate interface files in the case of SWIG. This often turns out to be the largest maintenance cost when supporting a scripting API.
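The Adapter idea can be sketched in a few lines of Python. Both classes below are hypothetical: NativeTimer stands in for a C++ class exposed with its original names and units, and Timer is an adapter that offers the same functionality with more idiomatic names and types. Neither comes from Boost Python or SWIG.

```python
class NativeTimer:
    """Hypothetical stand-in for a C++ class exposed with its C++ names."""

    def __init__(self):
        self._ms = 0

    def GetElapsedMs(self):
        return self._ms

    def AdvanceMs(self, ms):
        self._ms += ms


class Timer:
    """Adapter: one-to-one mapping onto NativeTimer with idiomatic naming."""

    def __init__(self, native=None):
        self._native = native if native is not None else NativeTimer()

    @property
    def elapsed_seconds(self):
        # Translate the data type (integer milliseconds -> float seconds).
        return self._native.GetElapsedMs() / 1000.0

    def advance(self, seconds):
        self._native.AdvanceMs(int(round(seconds * 1000)))


timer = Timer()
timer.advance(1.5)
print(timer.elapsed_seconds)
```

Every method added to NativeTimer needs a matching entry in Timer, which is the synchronization cost described above in miniature.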
Boost Python (also written as boost::python or Boost.Python) is a C++ library that lets C++ APIs interoperate with Python. It is part of the excellent Boost libraries, available from http://www.boost.org/. With Boost Python you can create bindings programmatically in C++ code and then link the bindings against the Python and Boost Python libraries. This produces a dynamic library that can be imported directly into Python.
Boost Python includes support for the following capabilities and features in terms of wrapping C++ APIs:
SWIG is an open source utility that can be used to create bindings for C or C++ interfaces in a variety of high-level languages. The supported languages include scripting languages such as Lua, Perl, PHP, Python, Tcl, and Ruby, as well as non-scripting languages such as C#, Common Lisp, Java, Modula-3, OCaml, Octave, and R.
The central design concept of SWIG is the interface file, normally given a .i file extension. This file is used to specify the generic bindings for a given module using C/C++ as the syntax to define the bindings. The general format of a SWIG interface file is as follows:
%module <modulename>
%{
// declarations needed to compile the generated C++ binding code
%}
// declarations for the classes, functions, etc. to be wrapped
The SWIG program can then read this interface file and generate bindings for a specific language. These bindings are then compiled to a shared library that can be loaded by the scripting language. For more information about SWIG, see http://www.swig.org/.
SIP is a tool that lets you create C and C++ bindings for Python. It was originally created for the PyQt package, which provides Python bindings for Nokia’s Qt toolkit. As such, Python-SIP has specific support for the Qt signal/slot mechanism. However, the tool can also be used to create bindings for any C++ API.
SIP works in a very similar fashion to SWIG, although it does not support the range of languages that SWIG does. SIP supports much of the C/C++ syntax for its interface specification files and uses a similar syntax for its commands as SWIG (i.e., tokens that start with a % symbol), although it supports a different set and style of commands to customize the binding. Here is an example of a simple Python-SIP interface specification file.
Component Object Model (COM) is a binary interface standard that allows objects to interact with each other via interprocess communication. COM objects specify well-defined interfaces that allow software components to be reused and linked together to build end-user applications. The technology was developed by Microsoft in 1993 and is still used today, predominantly on the Windows platform, although Microsoft now encourages the use of .NET and SOAP.
COM encompasses a large suite of technologies, but the part I will focus on here is COM Automation, also known as OLE Automation or simply Automation. This involves Automation objects (also known as ActiveX objects) being accessed from scripting languages to perform repetitive tasks or to control an application from script. A large number of target languages are supported, such as Visual Basic, JScript, Perl, Python, Ruby, and the range of Microsoft .NET languages.
A COM object is identified by a Universally Unique ID (UUID) and exposes its functionality via interfaces that are also identified by UUIDs. All COM objects support the IUnknown interface methods of AddRef(), Release(), and QueryInterface(). COM Automation objects additionally implement the IDispatch interface, which includes the Invoke() method to trigger a named function in the object.
The object model for the interface being exposed is described using an interface description language (IDL). IDL is a language-neutral description of a software component’s interface, normally stored in a file with an .idl extension. This IDL description can then be translated into various forms using the MIDL.EXE compiler on Windows. The generated files include the proxy DLL code for the COM object and a type library that describes the object model. The following sample shows an example of the Microsoft IDL syntax:
import "mydefs.h", "unknown.idl";
[object, uuid(d1420a03-d0ec-11b1-c04f-008c3ac31d2f)]
interface ISomething : IUnknown
{
    HRESULT MethodA([in] short Param1, [out] BKFST *pParam2);
    HRESULT MethodB([in, out] BKFST *pParam1);
};
[object, uuid(1e1423d1-ba0c-d110-043a-00cf8cc31d2f)]
interface ISomethingElse : IUnknown
{
HRESULT MethodC([in] long Max,
There is also a framework called Cross-Platform Component Object Model, or XPCOM. This is an open source project developed by Mozilla and used in a number of their applications, including the Firefox browser. XPCOM follows a very similar design to COM, although their components are not compatible or interchangeable.
The Common Object Request Broker Architecture (CORBA) is an industry standard to allow software components to communicate with each other independent of their location and vendor. In this regard, it is very similar to COM: both technologies solve the problem of communication between objects from different sources and both make use of a language-neutral IDL format to describe each object’s interface.
CORBA is cross-platform with several open source implementations and provides strong support for UNIX platforms. It was defined in 1991 by the Object Management Group (the same group that manages the UML modeling language). CORBA offers a wide range of language bindings, including Python, Perl, Ruby, Smalltalk, JavaScript, Tcl, and the CORBA Scripting Language (IDLscript). It also supports interfaces with multiple inheritance, whereas COM supports only single inheritance.
In terms of scripting, CORBA doesn’t require a specific automation interface as COM does. All CORBA objects are scriptable by default via the Dynamic Invocation Interface, which lets scripting languages determine the object’s interface dynamically. As an example of accessing CORBA objects from a scripting language, here is a simple IDL description and how it maps to the Ruby language.
The rest of this chapter is dedicated to giving you a concrete understanding of how to create script bindings for your C++ API. I begin by showing you how to create Python bindings using the Boost Python libraries.
Python is an open source, dynamically typed language that was designed by Guido van Rossum and first appeared in 1991. Python is strongly typed and features automatic memory management and reference counting. It comes with a large and extensive standard library, including modules such as os, sys, re, difflib, codecs, datetime, math, gzip, csv, socket, json, and xml, among many others. One of the more unusual aspects of Python is the fact that indentation is used to define scope, as opposed to curly braces in C and C++. The original CPython implementation of Python is the most common, but there are other major implementations, such as Jython (written in Java) and IronPython (targeting the .NET framework). For more details on the Python language, refer to http://www.python.org/.
As already noted, Boost Python is used to define Python bindings programmatically, which can then be compiled to a dynamic library that can be loaded directly by Python. Figure 11.1 illustrates this basic workflow.
Figure 11.1 The workflow for creating Python bindings of a C++ API using Boost Python. White boxes represent files; shaded boxes represent commands.
Many Boost packages are implemented solely as headers, using templates and inline functions, so you only need to make sure that you add the Boost directory to your compiler’s include search path. However, using Boost Python requires that you build and link against the boost_python library, so you need to know how to build Boost.
The recommended way to build Boost libraries is to use the bjam utility, a descendant of the Perforce Jam build system. So first you will need to download bjam. Prebuilt executables are available for most platforms from http://www.boost.org/.
Building the boost libraries on a UNIX variant, such as Linux or Mac, involves the following steps:
The <toolset> string is used to define the compiler that you wish to build under, for example, “gcc,” “darwin,” “msvc,” or “intel.”
If you have multiple versions of Python installed on your machine, you can specify which version to use via bjam’s configuration file, a file called user-config.bjam that you should create in your home directory. You can find out more details about configuring bjam in the Boost.Build manual, but essentially you will want to add something like the following entries to your user-config.bjam file:
On Windows, you can perform similar steps from the command prompt (just run bootstrap.bat instead of bootstrap.sh). However, you can also simply download prebuilt Boost libraries from http://www.boostpro.com/. If you use the prebuilt libraries, you will need to make sure that you compile your script bindings using the same version of Python used to compile the BoostPro libraries.
Let’s start by presenting a simple C++ API, which I will then expose to Python. I’ll use the example of a phone book that lets you store phone numbers for multiple contacts. This gives us a manageable, yet nontrivial, example to build upon throughout the chapter. Here’s the public definition of our phonebook API.
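#include <string>

class Person
{
public:
    Person();
    explicit Person(const std::string &name);

    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
};

class PhoneBook
{
public:
    int GetSize() const;
    void AddPerson(Person *p);
    void RemovePerson(const std::string &name);
    Person *FindPerson(const std::string &name);
};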
Note that this will let us demonstrate a number of capabilities, such as wrapping multiple classes, the use of STL containers, and multiple constructors. I will also take the opportunity to demonstrate the creation of Python properties in addition to the direct mapping of C++ member functions to Python methods.
The Person class is essentially just a data container: it only contains getter/setter methods that access underlying data members. These are good candidates for translating to Python properties. A property in Python behaves like a normal object but uses getter/setter methods to manage access to that object (as well as a deleter method for destroying the object). This makes for more intuitive access to class members that you want to behave like a simple data member while also letting you provide logic that controls getting and setting the value.
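To show what a Python property looks like on its own, here is a minimal pure-Python sketch. It is not the Boost Python binding: the Person class below is a stand-in, and the validation in the setter and the reset behavior of the deleter are arbitrary choices made for illustration.

```python
class Person:
    """Pure-Python illustration of a property; not the C++ binding itself."""

    def __init__(self, name=""):
        self._name = name

    def _get_name(self):
        return self._name

    def _set_name(self, value):
        # Logic can run on every assignment, for example validation.
        if not isinstance(value, str):
            raise TypeError("name must be a string")
        self._name = value

    def _del_name(self):
        # For this sketch, "deleting" the property just resets the value.
        self._name = ""

    # property(getter, setter, deleter) ties the three functions together.
    name = property(_get_name, _set_name, _del_name)


p = Person()
p.name = "Genevieve"   # calls _set_name()
print(p.name)          # calls _get_name()
del p.name             # calls _del_name()
```

From the caller’s point of view, name reads like a plain data member, yet every access still passes through a method.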
Now that I have presented our C++ API, let’s look at how you can specify Python bindings for it using boost::python. You will normally create a separate .cpp file to specify the bindings for a given module, where it’s conventional to use the same base filename as the module with a _wrap suffix appended, that is, I will use phonebook_wrap.cpp for our example. This wrap file is where you specify the classes that you want to expose and the methods that you want to be available on those classes. The following file presents the boost::python code necessary to wrap our phonebook.h API.
#include "phonebook.h"
#include <boost/python.hpp>

using namespace boost::python;

BOOST_PYTHON_MODULE(phonebook)
{
    class_<Person>("Person")
        .add_property("name", &Person::GetName, &Person::SetName)
        .add_property("home_number", &Person::GetHomeNumber,
                      &Person::SetHomeNumber)
        ;

    class_<PhoneBook>("PhoneBook")
        .def("size", &PhoneBook::GetSize)
        .def("add_person", &PhoneBook::AddPerson)
        .def("remove_person", &PhoneBook::RemovePerson)
        .def("find_person", &PhoneBook::FindPerson,
             return_value_policy<reference_existing_object>())
        ;
}

Note that for the Person class, I defined two properties, called name and home_number, and I provided the C++ getter/setter functions to control access to those properties (if I only provided a getter method then the property would be read only). For the PhoneBook class, I defined standard methods, called size(), add_person(), remove_person(), and find_person(), respectively. I also had to specify explicitly how I want the pointer return value of find_person() to behave.
You can then compile the code for phonebook.cpp and phonebook_wrap.cpp to a dynamic library. This will involve compiling against the headers for Python and boost::python as well as linking against the libraries for both. The result should be a phonebook.so library on Mac and Linux or phonebook.pyd on Windows (since Python 2.5, extension modules with a .dll extension are no longer imported on Windows). (Note: Python doesn’t recognize the .dylib extension on the Mac.) For example, on Linux:
g++ -c phonebook.cpp
g++ -c phonebook_wrap.cpp -I<boost_includes> -I<python_include>
g++ -shared -o phonebook.so phonebook.o phonebook_wrap.o -lboost_python -lpython
At this point, you can directly load the dynamic library into Python using its import keyword. Here is some sample Python code that loads our C++ library and demonstrates the creation of Person and PhoneBook objects. Note the property syntax for accessing the Person object, for example, p.name, versus the method call syntax for the PhoneBook members, for example, book.add_person().
In our phonebook_wrap.cpp file, I didn’t specify constructors for the Person or PhoneBook classes explicitly. In this case, Boost Python will expose the default constructor for each class, which is why these objects can be created from Python with no arguments, for example, phonebook.Person().
However, note that in the C++ API, the Person class has two constructors, a default constructor and a second constructor that accepts a string parameter: Person() and explicit Person(const std::string &name).
You can tell Boost Python to expose both of these constructors by updating the wrapping code for Person to specify a single constructor in the class definition and then list further constructors using the .def() syntax.
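class_<Person>("Person", init<>())
    .def(init<std::string>())
    .add_property("name", &Person::GetName, &Person::SetName)
    .add_property("home_number", &Person::GetHomeNumber, &Person::SetHomeNumber)
    ;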
Now you can create Person objects from Python using either constructor.
It’s also possible to add new methods to the Python API that don’t exist in the C++ API. This is used most commonly to define some of the standard Python object methods, such as __str__() to return a human-readable version of the object or __eq__() to test for equality.
In the following example, I have updated the phonebook_wrap.cpp file to include a static free function that prints out the values of a Person object. I then use this function to define the Person.__str__() method in Python.
#include "phonebook.h"
#include <boost/python.hpp>
#include <sstream>

using namespace boost::python;

static std::string PrintPerson(const Person &p)
{
    std::ostringstream stream;
    stream << p.GetName() << ": " << p.GetHomeNumber();
    return stream.str();
}

BOOST_PYTHON_MODULE(phonebook)
{
    class_<Person>("Person", init<>())
        .def(init<std::string>())
        .add_property("name", &Person::GetName, &Person::SetName)
        .add_property("home_number", &Person::GetHomeNumber,
                      &Person::SetHomeNumber)
        .def("__str__", &PrintPerson)
        ;
}

This demonstrates the general ability to add new methods to a class. However, in this particular case, Boost Python provides an alternative way to specify the __str__() function in a more idiomatic fashion. You could define operator<< for Person and tell Boost to use this operator for the __str__() method. For example,
#include "phonebook.h"
#include <boost/python.hpp>
#include <ostream>

using namespace boost::python;

std::ostream &operator<<(std::ostream &os, const Person &p)
{
    os << p.GetName() << ": " << p.GetHomeNumber();
    return os;
}

BOOST_PYTHON_MODULE(phonebook)
{
    class_<Person>("Person", init<>())
        .def(init<std::string>())
        .add_property("name", &Person::GetName, &Person::SetName)
        .add_property("home_number", &Person::GetHomeNumber,
                      &Person::SetHomeNumber)
        .def(self_ns::str(self))
        ;
}

With this definition for Person.__str__() you can now write code like the following (entered at the interactive Python interpreter prompt, >>>):
While I am talking about extending the Python API, I will note that the dynamic nature of Python means that you can actually add new methods to a class at run time. This is not a Boost Python feature, but a core capability of the Python language itself. For example, you could define the __str__() method at the Python level, as follows:
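import phonebook

def person_str(self):
    return "Name: %s\nHome: %s" % (self.name, self.home_number)

# override the __str__ method for the Person class
phonebook.Person.__str__ = person_str

Printing a Person object then outputs two lines to the shell, the first beginning with Name: and the second with Home:.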
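The same mechanism can be tried without the compiled module. In the sketch below, Person is a pure-Python stand-in that I assume exposes the same name and home_number attributes as the bound class; the sample values are made up.

```python
class Person:
    """Stand-in for phonebook.Person (assumed attributes: name, home_number)."""

    def __init__(self, name="", home_number=""):
        self.name = name
        self.home_number = home_number


def person_str(self):
    return "Name: %s\nHome: %s" % (self.name, self.home_number)


# The instance is created before the method is attached...
p = Person("Genevieve", "(123) 456-7890")

# ...and still picks it up, because Python looks up __str__ on the class.
Person.__str__ = person_str

print(p)
```

Because the lookup happens on the class at call time, objects that already exist gain the new behavior as well.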
Both C++ and Python support multiple inheritance, and Boost Python makes it easy to expose all of the base classes of any C++ class. I’ll show how this is done by turning the Person class into a base class (i.e., provide a virtual destructor) and adding a derived class called PersonWithCell, which adds the ability to specify a cell phone number. This is not a particularly good design choice, but it serves our purposes for this example.
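class Person
{
public:
    Person();
    explicit Person(const std::string &name);
    virtual ~Person();
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
};

class PersonWithCell : public Person
{
public:
    PersonWithCell();
    explicit PersonWithCell(const std::string &name);
    void SetCellNumber(const std::string &number);
    std::string GetCellNumber() const;
};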
You can then represent this inheritance hierarchy in Python by updating the wrap file as follows:
BOOST_PYTHON_MODULE(phonebook)
{
    class_<Person>("Person", init<>())
        .add_property("name", &Person::GetName, &Person::SetName)
        .add_property("home_number", &Person::GetHomeNumber,
                      &Person::SetHomeNumber)
        ;

    class_<PersonWithCell, bases<Person> >("PersonWithCell")
        .add_property("cell_number", &PersonWithCell::GetCellNumber,
                      &PersonWithCell::SetCellNumber)
        ;
}
Now you can create PersonWithCell objects from Python as follows:
You can create classes in Python that derive from C++ classes that you’ve exposed with Boost Python. For example, the following Python program shows how you could create the PersonWithCell class directly in Python and still be able to add instances of this class to PhoneBook.
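class PyPersonWithCell(phonebook.Person):
    def get_cell_number(self): return self.cell
    def set_cell_number(self, number): self.cell = number
    cell_number = property(get_cell_number, set_cell_number)

p = PyPersonWithCell()
p.home_number = '(123) 456-7890'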
Of course, the cell_number property on PyPersonWithCell will only be callable from Python. C++ will have no idea that a new method has been dynamically added to an inherited class.
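The following self-contained sketch shows why the base class cannot see the new property. It assumes a plain Python Person class as a hypothetical stand-in for the C++-backed phonebook.Person, and the _cell attribute name and sample phone numbers are illustrative choices only.

```python
class Person:
    """Hypothetical stand-in for the C++-backed phonebook.Person."""

    def __init__(self):
        self.name = ""
        self.home_number = ""


class PyPersonWithCell(Person):
    def __init__(self):
        super().__init__()
        self._cell = ""          # storage that lives only on the Python side

    def get_cell_number(self):
        return self._cell

    def set_cell_number(self, number):
        self._cell = number

    # Expose the getter/setter pair as a single attribute.
    cell_number = property(get_cell_number, set_cell_number)


p = PyPersonWithCell()
p.name = "Martin"
p.home_number = "(123) 456-7890"
p.cell_number = "(123) 555-0100"

print(p.cell_number)
# The base class knows nothing about the property added by the subclass.
print(hasattr(Person(), "cell_number"))
```

The derived object is still a Person, so it can be passed anywhere a Person is expected, but anything that only knows about the base type has no way to reach cell_number.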
It’s also important to note that even C++ virtual functions that are overridden in Python will not be callable from C++ by default. However, Boost Python does provide a way to do this if cross-language polymorphism is important for your API. This is done by defining a wrapper class that multiply inherits from the C++ class being bound as well as Boost Python’s wrapper class template. This wrapper class can then check to see if an override has been defined in Python for a given virtual function and then call that method if it is defined. For example, given a C++ class called Base with a virtual method, you can create the wrapper class as follows:
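// assumes Base declares: virtual int f();
class BaseWrap : public Base, public wrapper<Base>
{
public:
    int f()
    {
        // check for an override in Python
        if (override f = this->get_override("f"))
            return f();
        // no Python override: fall back to the C++ implementation
        return Base::f();
    }
};
Then you can expose the Base class as follows: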
Boost Python also lets you create Python iterators based on STL iterator interfaces that you define in your C++ API. This lets you create objects in Python that behave more “Pythonically” in terms of iterating through the elements in a container. For example, you can add begin() and end() methods to the PhoneBook class that provide access to STL iterators for traversing through all of the contacts in the phone book.
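class PhoneBook {
public:
    // (other members unchanged)
    void RemovePerson(const std::string &name);
    Person *FindPerson(const std::string &name);
    typedef std::vector<Person *> PersonList;
    PersonList::iterator begin() { return mList.begin(); }
    PersonList::iterator end() { return mList.end(); }
};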
With these additional methods, you can extend the wrapping for the PhoneBook class to specify the __iter__() method, which is the Python way for an object to return an iterator.
BOOST_PYTHON_MODULE(phonebook)
{
    class_<PhoneBook>("PhoneBook")
        .def("size", &PhoneBook::GetSize)
        .def("add_person", &PhoneBook::AddPerson)
        .def("remove_person", &PhoneBook::RemovePerson)
        .def("find_person", &PhoneBook::FindPerson,
             return_value_policy<reference_existing_object>())
        .def("__iter__", range(&PhoneBook::begin, &PhoneBook::end));
}
Now, you can write Python code that iterates through all of the contacts in a PhoneBook object as follows:
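As an illustration of the protocol involved, here is a pure-Python sketch of a container that supports iteration through __iter__(). The Person and PhoneBook classes below are hypothetical stand-ins written in Python, not the Boost Python bindings; they only reuse the method names from the wrapper above.

```python
class Person:
    """Hypothetical stand-in for the bound Person class."""

    def __init__(self, name):
        self.name = name


class PhoneBook:
    """Hypothetical stand-in that mimics the bound PhoneBook interface."""

    def __init__(self):
        self._people = []

    def add_person(self, person):
        self._people.append(person)

    def size(self):
        return len(self._people)

    def __iter__(self):
        # Returning an iterator is all Python needs to support "for ... in".
        return iter(self._people)


book = PhoneBook()
book.add_person(Person("Martin"))
book.add_person(Person("Genevieve"))

for person in book:
    print(person.name)

names = [person.name for person in book]
```

Any object that defines __iter__() works with for loops, list comprehensions, and built-ins such as list() and sorted(), which is why exposing it makes the bound class feel native.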
Combining all of the features that I’ve introduced in the preceding sections, here is the final definition of the phonebook.h header and the phonebook_wrap.cpp boost::python wrapper.
class Person {
public:
    Person();
    explicit Person(const std::string &name);
    virtual ~Person();
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
};

class PersonWithCell : public Person {
public:
    PersonWithCell();
    explicit PersonWithCell(const std::string &name);
    void SetCellNumber(const std::string &number);
    std::string GetCellNumber() const;
};

class PhoneBook {
public:
    int GetSize() const;
    void AddPerson(Person *p);
    void RemovePerson(const std::string &name);
    Person *FindPerson(const std::string &name);
    typedef std::vector<Person *> PersonList;
    PersonList::iterator begin() { return mList.begin(); }
    PersonList::iterator end() { return mList.end(); }
private:
    PersonList mList;
};
#include "phonebook.h"
#include <boost/python.hpp>
#include <sstream>

using namespace boost::python;

std::ostream &operator<<(std::ostream &os, const Person &p)
{
    os << p.GetName() << ": " << p.GetHomeNumber();
    return os;
}

static std::string PrintPersonWithCell(const PersonWithCell *p)
{
    std::ostringstream stream;
    stream << "Name: " << p->GetName() << ", Home: ";
    stream << p->GetHomeNumber() << ", Cell: ";
    stream << p->GetCellNumber();
    return stream.str();
}

BOOST_PYTHON_MODULE(phonebook)
{
    class_<Person>("Person", init<>())
        .add_property("name", &Person::GetName, &Person::SetName)
        .add_property("home_number", &Person::GetHomeNumber,
                      &Person::SetHomeNumber)
        .def(self_ns::str(self))
        ;

    class_<PersonWithCell, bases<Person> >("PersonWithCell")
        .add_property("cell_number", &PersonWithCell::GetCellNumber,
                      &PersonWithCell::SetCellNumber)
        .def("__str__", &PrintPersonWithCell)
        ;

    class_<PhoneBook>("PhoneBook")
        .def("size", &PhoneBook::GetSize)
        .def("add_person", &PhoneBook::AddPerson)
        .def("remove_person", &PhoneBook::RemovePerson)
        .def("find_person", &PhoneBook::FindPerson,
             return_value_policy<reference_existing_object>())
        .def("__iter__", range(&PhoneBook::begin, &PhoneBook::end));
}
The following sections look at another example of creating script bindings for C++ APIs. In this case, I will use the Simplified Wrapper and Interface Generator (SWIG) to create bindings for the Ruby language.
Ruby is an open source dynamically typed scripting language that was released by Yukihiro “Matz” Matsumoto in 1995. Ruby was influenced by languages such as Perl and Smalltalk with an emphasis on ease of use. In Ruby, everything is an object, even types that C++ treats separately as built-in primitives such as int, float, and bool. Ruby is an extremely popular scripting language and is often cited as being more popular than Python in Japan, where it was originally developed. For more information on the Ruby language, see http://www.ruby-lang.org/.
SWIG works by reading the binding definition within an interface file and generating C++ code to specify the bindings. This generated code can then be compiled to a dynamic library that can be loaded directly by Ruby. Figure 11.2 illustrates this basic workflow. Note that SWIG supports many scripting languages. I will use it to create Ruby bindings, but it could just as easily be used to create Python bindings, Perl bindings, or bindings for several other languages.
Figure 11.2 The workflow for creating Ruby bindings of a C++ API using SWIG. White boxes represent files; shaded boxes represent commands.
I’ll start with the same phone book API from the Python example and then show how to create Ruby bindings for this interface using SWIG. The phone book C++ header looks like
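class Person {
public:
    Person();
    explicit Person(const std::string &name);
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
};

class PhoneBook {
public:
    bool IsEmpty() const;
    int GetSize() const;
    void AddPerson(Person *p);
    void RemovePerson(const std::string &name);
    Person *FindPerson(const std::string &name);
};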
Let’s take a look at a basic SWIG interface file to specify how you want to expose this C++ API to Ruby.
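%module phonebook
%{
// we need the API header to compile the bindings
#include "phonebook.h"
%}
// pull in the built-in SWIG STL wrappings (note the '%')
%include "std_string.i"

class Person {
public:
    Person();
    explicit Person(const std::string &name);
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
};

class PhoneBook {
public:
    bool IsEmpty() const;
    int GetSize() const;
    void AddPerson(Person *p);
    void RemovePerson(const std::string &name);
    Person *FindPerson(const std::string &name);
};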
You can see that the interface file looks very similar to the phonebook.h header file. In fact, SWIG can parse most C++ syntax directly. If your C++ header is very simple, you can even use SWIG’s %include directive to simply tell it to read the C++ header file directly. I’ve chosen not to do this so that you have direct control over what you do and do not expose to Ruby.
Now that you have an initial interface file, you can ask SWIG to read this file and generate Ruby bindings for all the specified C++ classes and methods. This will create a phonebook_wrap.cxx file, which you can compile together with the C++ code to create a dynamic library. For example, the steps on Linux are
This first attempt at a Ruby binding is rather rudimentary. There are several issues that you will want to address to make the API feel more natural to Ruby programmers. First off, the naming convention for Ruby methods is to use snake case instead of camel case, that is, add_person() instead of AddPerson(). SWIG supports this by letting you rename symbols in the scripting API using its %rename command. For example, you can add the following lines to the interface file to tell SWIG to rename the methods of the PhoneBook class.
%rename("size") PhoneBook::GetSize;
%rename("add_person") PhoneBook::AddPerson;
Recent versions of SWIG actually support an -autorename command line option to perform this function renaming automatically. It is expected that this option will eventually be turned on by default.
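To make the naming transformation concrete, here is a small sketch of a camel case to snake case conversion. The camel_to_snake() helper is a hypothetical illustration of the renaming rule only; it is not how SWIG implements -autorename, and it does not drop prefixes such as Get (mapping GetSize to size still needs an explicit %rename).

```python
import re


def camel_to_snake(name):
    """Convert a CamelCase identifier such as 'AddPerson' to 'add_person'."""
    # Insert an underscore before every capital letter except the first,
    # then lower-case the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


for cpp_name in ("AddPerson", "RemovePerson", "FindPerson"):
    print(cpp_name, "->", camel_to_snake(cpp_name))
```

This simple rule splits before every capital letter, so identifiers containing acronyms would need extra handling.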
Second, Ruby has a concept similar to Python’s properties to provide convenient access to data members. In fact, rather elegantly, all instance variables in Ruby are private and must therefore be accessed via getter/setter methods. The %rename syntax can be used to achieve this too, by mapping the C++ getter to a Ruby method with the attribute’s name and the C++ setter to a method whose name ends in an equals sign. For example,
%rename("name") Person::GetName;
%rename("name=") Person::SetName;
Finally, you may have noticed that I added an extra IsEmpty() method to the PhoneBook C++ class. This method simply returns true if no contacts have been added to the phone book. I’ve added this because it lets me demonstrate how to expose a C++ member function as a Ruby query method. This is a method that returns a boolean return value and by convention it ends with a question mark. I would therefore like the IsEmpty() C++ function to appear as empty? in Ruby. This can be done using either SWIG’s %predicate or %rename directives.
With these amendments to our interface file, our Ruby API is starting to feel more native. If you rerun SWIG on the interface file and rebuild the phonebook dynamic library, you can import it directly into Ruby and write code such as
Note the use of the p.name getter and p.name= setter, as well as the snake case add_person() method name.
Our Person class has two constructors: a default constructor that takes no parameters and a non-default constructor that takes a std::string name. Using SWIG, you simply have to include those constructor declarations in the interface file and it will automatically create the relevant constructors in Ruby. That is, given the earlier interface file, you can already do:
In general, method overloading is not quite as flexible in Ruby as it is in C++. For example, SWIG will not be able to disambiguate between overloaded functions that map to the same types in Ruby, for example, a constructor that takes a short and another that takes an int or a constructor that takes a pointer to an object and another that takes a reference to the same type. SWIG does provide a way to deal with this by letting you ignore a given overloaded method (using %ignore) or renaming one of the methods (using %rename).
SWIG lets you extend the functionality of your C++ API, for example, to add new methods to a class that will only appear in the Ruby API. This is done using SWIG’s %extend directive. I will demonstrate this by adding a to_s() method to the Ruby version of our Person class. This is a standard Ruby method used to return a human-readable representation of an object, equivalent to Python’s __str__() method.
class Person {
public:
    Person();
    explicit Person(const std::string &name);
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
    %extend {
        std::string to_s() {
            std::ostringstream stream;
            stream << self->GetName() << ": ";
            stream << self->GetHomeNumber();
            return stream.str();
        }
    }
};
Using this new definition for our Person binding, you can write the following Ruby code:
The puts p line will print out the Person object using our to_s() method. In this case, this results in the following output:
As with the constructor case just given, there’s nothing special that you have to do to represent inheritance using SWIG. You simply declare the class in the interface file using the standard C++ syntax. For example, you can add the following PersonWithCell class to our API:
class Person {
public:
    Person();
    explicit Person(const std::string &name);
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
};

class PersonWithCell : public Person {
public:
    explicit PersonWithCell(const std::string &name);
    void SetCellNumber(const std::string &number);
    std::string GetCellNumber() const;
};
Then you can update the SWIG interface file as follows:
%rename("name") Person::GetName;
%rename("name=") Person::SetName;
%rename("home_number") Person::GetHomeNumber;
%rename("home_number=") Person::SetHomeNumber;

class Person {
public:
    Person();
    explicit Person(const std::string &name);
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
};

%rename("cell_number") PersonWithCell::GetCellNumber;
%rename("cell_number=") PersonWithCell::SetCellNumber;

class PersonWithCell : public Person {
public:
    explicit PersonWithCell(const std::string &name);
    void SetCellNumber(const std::string &number);
    std::string GetCellNumber() const;
};
You can then access this derived C++ class from Ruby as follows:
Ruby supports only single inheritance, with support for additional mixin classes. C++ of course supports multiple inheritance. Therefore, by default, SWIG will only consider the first base class listed in the derived class: member functions in any other base classes will not be inherited. However, recent versions of SWIG support an optional -minherit command line option that will attempt to simulate multiple inheritance using Ruby mixins (although in this case a class no longer has a true base class in Ruby).
By default, if you override a virtual function in Ruby you will not be able to call the Ruby method from C++. However, SWIG gives you a way to enable this kind of cross-language polymorphism via its “directors” feature. When you enable directors for a class, SWIG generates a new wrapper class that derives from the C++ class as well as SWIG’s director class. The director class stores a pointer to the underlying Ruby object and works out whether a function call should be directed to an overridden Ruby method or the default C++ implementation. This is analogous to the way that Boost Python supports cross-language polymorphism. However, SWIG creates the wrapper class for you behind the scenes: all you have to do is specify which classes you want to create directors for and then enable the directors feature in your %module directive. For example, the following update to our interface file will turn on cross-language polymorphism for all our classes:
I have evolved our simple example through several iterations in order to add each incremental enhancement. So I will finish off this section by presenting the entire C++ header and SWIG interface file for your reference. First of all, here is the C++ API:
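class Person {
public:
    Person();
    explicit Person(const std::string &name);
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
};

class PersonWithCell : public Person {
public:
    explicit PersonWithCell(const std::string &name);
    void SetCellNumber(const std::string &number);
    std::string GetCellNumber() const;
};

class PhoneBook {
public:
    bool IsEmpty() const;
    int GetSize() const;
    void AddPerson(Person *p);
    void RemovePerson(const std::string &name);
    Person *FindPerson(const std::string &name);
};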
and here is the final SWIG interface (.i) file:
%module(directors="1") phonebook
%{
#include "phonebook.h"
#include <sstream>
%}
%feature("director");
%include "std_string.i"

%rename("name") Person::GetName;
%rename("name=") Person::SetName;
%rename("home_number") Person::GetHomeNumber;
%rename("home_number=") Person::SetHomeNumber;

class Person {
public:
    Person();
    explicit Person(const std::string &name);
    void SetName(const std::string &name);
    std::string GetName() const;
    void SetHomeNumber(const std::string &number);
    std::string GetHomeNumber() const;
    %extend {
        std::string to_s() {
            std::ostringstream stream;
            stream << self->GetName() << ": ";
            stream << self->GetHomeNumber();
            return stream.str();
        }
    }
};

%rename("cell_number") PersonWithCell::GetCellNumber;
%rename("cell_number=") PersonWithCell::SetCellNumber;

class PersonWithCell : public Person {
public:
    explicit PersonWithCell(const std::string &name);
    void SetCellNumber(const std::string &number);
    std::string GetCellNumber() const;
};

%rename("empty?") PhoneBook::IsEmpty;
%rename("size") PhoneBook::GetSize;
%rename("add_person") PhoneBook::AddPerson;
%rename("remove_person") PhoneBook::RemovePerson;
%rename("find_person") PhoneBook::FindPerson;

class PhoneBook {
public:
    bool IsEmpty() const;
    int GetSize() const;
    void AddPerson(Person *p);
    void RemovePerson(const std::string &name);
    Person *FindPerson(const std::string &name);
};