At this point, all of our classes and functions have been consolidated to a central location, and all related include
statements have been removed. We would prefer to start writing tests for our classes, but it is very likely that we have a lot of global
variables embedded in them. These can cause a lot of trouble via action at a distance where modifying a global
in one place changes its value in another place. The next step, then, is to remove all uses of the global
keyword from our classes, and inject the necessary dependencies instead.
What Is Dependency Injection?
Dependency injection means that we push our dependencies into a class from the outside, instead of pulling them into a class while inside the class. (Using global pulls a variable into the current scope from the global scope, so it is the opposite of injection.) Dependency injection turns out to be very straightforward as a concept, but is sometimes difficult to adhere to as a discipline.
To start with a naive example, let's say an Example
class needs a database connection. Here we create the connection inside a class method:
classes/Example.php
<?php
class Example
{
    public function fetch()
    {
        $db = new Db('hostname', 'username', 'password');
        return $db->query(...);
    }
}
?>
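We are creating the Db dependency inside the method that needs it. There are several problems with this. Some of them are:
- Every call to fetch() creates a new database connection, which can use up limited resources.
- The connection parameters are repeated wherever a connection is created, so changing them means editing several locations.
- Nothing outside the class reveals that it depends on Db.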
After writing code like this, many developers discover the global
keyword, and realize they can create the connection once in a setup file, then pull it in from the global scope:
setup.php
<?php
// some setup code, then:
$db = new Db('hostname', 'username', 'password');
?>
classes/Example.php
<?php
class Example
{
    public function fetch()
    {
        global $db;
        return $db->query(...);
    }
}
?>
Even though we are still pulling in the dependency, this technique solves the problem of multiple database connections using up limited resources, since the same database connection is reused across the codebase. The technique also makes it possible to change our connection parameters in a single location, the setup.php
file, instead of several locations. However, one problem remains, and one is added:
- The remaining problem: we are still pulling in the dependency, so it is still not visible from outside the class.
- The added problem: if the $db variable is ever changed by any of the calling code, that change is reflected throughout the codebase, leading to debugging trouble.
The last point is a killer. If a method ever sets $db = 'busted'; then the $db value is now a string, and not a database connection object, throughout the entire codebase. Likewise, if the $db object is modified, then it is modified for the entire codebase. This can lead to very difficult debugging sessions.
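The effect of one careless assignment can be seen in a small sketch. Everything here is hypothetical: FakeDb stands in for a real connection class, and the setup(), careless(), and fetchReport() functions exist only to show one piece of code breaking another through the shared global.

```php
<?php
// Hypothetical stand-in for a real connection class.
class FakeDb
{
    public function query($sql)
    {
        return "result of: $sql";
    }
}

// Plays the role of the setup file: creates the shared global.
function setup()
{
    global $db;
    $db = new FakeDb();
}

// Some unrelated code that reuses the name $db for its own purposes.
function careless()
{
    global $db;
    $db = 'busted';
}

// Code that relies on the global still being a connection object.
function fetchReport()
{
    global $db;
    if (! is_object($db)) {
        return 'error: $db is a ' . gettype($db);
    }
    return $db->query('SELECT 1');
}

setup();
$before = fetchReport(); // works: $db is a FakeDb
careless();
$after = fetchReport();  // fails: $db was replaced somewhere else
```

Nothing in fetchReport() changed between the two calls, yet its result did. The cause lies in a different function entirely, which is what makes this kind of bug hard to trace.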
Thus, we want to remove all global
calls from the codebase to make it easier to troubleshoot, and to reveal the dependencies in our classes. Here is the general process we will use to replace global
calls with dependency injection:
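1. Find a global variable in one of our classes.
2. Move all global variables in that class to the constructor and retain their values as properties, and use the properties instead of the globals.
3. Spot check that the class still works.
4. Convert the global calls in the constructor to constructor parameters.
5. Convert all instantiations of the class to pass the dependencies.
6. Spot check, commit, push, and notify QA.
7. Repeat with the next global call in our class files, until none remain.
Finding a global variable is easy with a project-wide search function. We search for global within the central class directory location, and get back a list of class files with that keyword in them.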
Let's say that our search revealed an Example
class with code something like the following:
classes/Example.php
<?php
class Example
{
    public function fetch()
    {
        global $db;
        return $db->query(...);
    }
}
?>
We now move the global variable to a property that gets set in the constructor, and convert the fetch()
method to use the property:
classes/Example.php
<?php
class Example
{
    protected $db;

    public function __construct()
    {
        global $db;
        $this->db = $db;
    }

    public function fetch()
    {
        return $this->db->query(...);
    }
}
?>
Now that we have converted global
calls to properties in this one class, we need to test the application to make sure it still works. However, since there is no formal testing system in place yet, we pseudo-test or spot check by browsing to or otherwise invoking files that use the modified class.
If we like, we can make an interim commit here once we are sure the application still works. We will not push to the central repository or notify QA just yet; all we want is a point to which we can roll back if later changes need to be undone.
Once we ascertain that the class works with the properties in place, we need to convert the global
calls in the constructor to use passed parameters instead. Given our Example
class above, the converted version might look like this:
classes/Example.php
<?php
class Example
{
    protected $db;

    public function __construct(Db $db)
    {
        $this->db = $db;
    }

    public function fetch()
    {
        return $this->db->query(...);
    }
}
?>
All we have done here is remove the global call and add a constructor parameter. We need to do this for every global in the constructor.
Since the global is for a particular class of object, we typehint the parameter to that class (in this case, Db). If possible, we should typehint to an interface instead, so if the Db object implements a DbInterface, we should typehint to DbInterface. This will help with testing and later refactoring. We may also typehint to array or callable as appropriate. Not all global calls are for typed values, so not all parameters will need typehints (e.g., when the parameter is expected to be a string).
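To see why an interface typehint helps with testing, consider the following sketch. The DbInterface name comes from the text above, but its single query() method, the FakeDb class, and the query string are assumptions made for illustration. The real interface, if one exists, may look different.

```php
<?php
// Assumed shape of the interface: one query() method.
interface DbInterface
{
    public function query($sql);
}

// Hypothetical stand-in that returns canned rows and needs no database.
class FakeDb implements DbInterface
{
    public function query($sql)
    {
        return array(array('id' => 1));
    }
}

class Example
{
    protected $db;

    // Any DbInterface implementation is accepted, real or fake.
    public function __construct(DbInterface $db)
    {
        $this->db = $db;
    }

    public function fetch()
    {
        // Hypothetical query, for illustration only.
        return $this->db->query('SELECT id FROM example');
    }
}

$example = new Example(new FakeDb());
$rows = $example->fetch();
```

Because the constructor asks only for the interface, a test can hand the class a fake connection, and a later refactoring can swap the real Db implementation without touching Example.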
After converting global
variables to constructor parameters, we will find that every instantiation of the class throughout the legacy application is now broken. This is because the constructor signature has changed. With that in mind, we now need to search the entire codebase (not just the classes) for instantiations of the class, and change the instantiations to the new signature.
To search for instantiations, we use our project-wide search facility to find uses of the new
keyword with our class name using a regular expression:
new\s+Example\W
The expression searches for the new keyword, followed by at least one character of whitespace, followed by the class name, followed by a terminating non-word character (such as a parenthesis, space, or semicolon).
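As a quick check of what the expression does and does not match, we can run it over a few sample lines with preg_match(). The sample lines and the ExampleFactory class name are made up for this sketch.

```php
<?php
// The search expression from the text, wrapped in PCRE delimiters.
$pattern = '/new\s+Example\W/';

// Hypothetical lines such as a project-wide search might encounter.
$lines = array(
    '$example = new Example;',      // terminated by a semicolon
    '$example = new Example($db);', // terminated by a parenthesis
    '$example = new   Example ();', // extra whitespace is fine
    '$other = new ExampleFactory;', // a different class: must not match
);

$matches = array();
foreach ($lines as $line) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $matches[] = preg_match($pattern, $line);
}
```

The trailing \W is what keeps ExampleFactory out of the results: the character after Example there is a word character, so the expression does not match.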
Formatting Issues
Legacy codebases are notorious for having messed-up formatting, which means this expression is imperfect in some situations. The expression as given here may not find instantiations where, for example, the new
keyword is on one line, and the class name is the very next thing but is on the next line, not the same line.
Class Aliases With use
In PHP 5.3 and later, classes may be aliased to another class name with a use statement, like so:
<?php
use Example as Foobar;
// ...
$foo = new Foobar;
?>
In this case, we need to do two searches: one for use\s+Example\s+as to discover the various aliases, and a second search for the new keyword with the alias.
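The two searches can be sketched in PHP itself. The source text being searched is a made-up sample, and the trailing \s+(\w+) capture is an addition to the expression given above, used here only to pick up the alias name for the second search.

```php
<?php
// Hypothetical file contents to search.
$source = <<<'SRC'
use Example as Foobar;
// ...
$foo = new Foobar;
$bar = new Foobar($db);
SRC;

// First search: discover the aliases of Example.
preg_match_all('/use\s+Example\s+as\s+(\w+)/', $source, $found);
$aliases = $found[1];

// Second search: count instantiations made through each alias.
$count = 0;
foreach ($aliases as $alias) {
    $pattern = '/new\s+' . preg_quote($alias, '/') . '\W/';
    $count += preg_match_all($pattern, $source);
}
```

In a real project we would run the same two expressions through the project-wide search tool rather than through a script. The sketch only shows how the result of the first search feeds the second.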
As we discover instantiations of the class in the codebase, we modify them to pass the parameters as needed. If, for example, a page script looks like this:
page_script.php
<?php
// a setup file that creates a $db variable
require 'includes/setup.php';
// ...
$example = new Example;
?>
We need to add the parameter to the instantiation:
page_script.php
<?php
// a setup file that creates a $db variable
require 'includes/setup.php';
// ...
$example = new Example($db);
?>
The new instantiations need to match the new constructor signature, so if the constructor takes more than one parameter, we need to pass all of the parameters.
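As an illustration of a constructor with more than one parameter, suppose the class had pulled in two globals: the connection and a table name. Both the $table parameter and the stub Db class below are hypothetical, invented so the sketch can run on its own.

```php
<?php
// Hypothetical stand-in so the sketch runs without a real database.
class Db
{
    public function query($sql)
    {
        return "ran: $sql";
    }
}

class Example
{
    protected $db;
    protected $table;

    // Two former globals become two constructor parameters. $table is
    // expected to be a string, so it carries no typehint.
    public function __construct(Db $db, $table)
    {
        $this->db = $db;
        $this->table = $table;
    }

    public function fetch()
    {
        return $this->db->query("SELECT * FROM {$this->table}");
    }
}

// These would normally be created by the setup file.
$db = new Db();
$table = 'examples';

// Every instantiation passes all parameters, in signature order.
$example = new Example($db, $table);
$result = $example->fetch();
```

Leaving out a required parameter raises an error at the point of instantiation, so missed call sites tend to surface quickly during spot checks.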
We have reached the end of the conversion process for this class. We need to spot check the converted instantiations now, but (as always) this is not an automated process, so we need to run or otherwise invoke the files with the changed code. If there are problems, we go back and fix them.
Once we have done so, and are sure there are no errors, we can commit the changed code, push it to our central repository, and notify QA that it needs to run its test suite over the legacy application.