Because we have changed the constructor signature, all the existing instantiations of ItemsGateway are now broken. We need to find all the places in the code where the ItemsGateway class is instantiated, and change the instantiations to pass a properly-constructed Db object and an ItemFactory.
To do so, we use our project-wide search facility to search using a regular expression for our changed class name:
Search for:
new\s+ItemsGateway\(
Doing so will give us a list of all instantiations in the project. We need to review each result and change it by hand to instantiate the dependencies and pass them to the ItemsGateway.
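As a rough illustration of what that expression matches, here is a minimal sketch that applies the same pattern to a string of source code with preg_match(). The findInstantiations() helper and the sample source are hypothetical, made up for this example; in practice the editor's or IDE's project-wide search does this work for us.

```php
<?php
// Hypothetical helper: return the 1-based line numbers in a source string
// that contain an ItemsGateway instantiation.
function findInstantiations($source)
{
    $matches = array();
    $lines = explode("\n", $source);
    foreach ($lines as $index => $line) {
        // the same expression we typed into the search facility
        if (preg_match('/new\s+ItemsGateway\(/', $line)) {
            $matches[] = $index + 1;
        }
    }
    return $matches;
}

// Made-up sample source, only to exercise the pattern.
$source = "<?php\n"
    . "\$items_gateway = new ItemsGateway(\$db_host, \$db_user, \$db_pass);\n"
    . "\$other = new  ItemsGateway(\$a, \$b, \$c);\n"
    . "\$unrelated = new ItemsGatewayHelper();\n";

$found = findInstantiations($source);
```

Note that the expression requires an opening parenthesis directly after the class name, so it would not find an instantiation written without parentheses, such as new ItemsGateway; with nothing after the class name. If the codebase uses that style, a second search without the trailing parenthesis may be worthwhile.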
For example, if a page script from the search results looks like this:
page_script.php
<?php
// $db_host, $db_user, and $db_pass are defined in the setup file
require 'includes/setup.php';

// ...

// create a gateway
$items_gateway = new ItemsGateway($db_host, $db_user, $db_pass);

// ...
?>
We need to change it to something more like this:
page_script.php
<?php
// $db_host, $db_user, and $db_pass are defined in the setup file
require 'includes/setup.php';

// ...

// create a gateway with its dependencies
$db = new Db($db_host, $db_user, $db_pass);
$item_factory = new ItemFactory;
$items_gateway = new ItemsGateway($db, $item_factory);

// ...
?>
Do this for each instantiation of the changed class.
Now that we have changed the class and the instantiations of the class throughout the codebase, we need to make sure our legacy application works. Again, we have no formal testing process in place, so we need to run or otherwise invoke the parts of the application that use the changed class and look for errors.
Once we feel sure that the application still operates properly, we commit the code, push it to our central repository, and notify QA that we are ready for them to test our new additions.
Search for the next new keyword in a class, and start the process all over again. When we find that new keywords exist only in Factory classes, our job is complete.
In this chapter, we concentrate on removing all use of the new keyword, except inside Factory objects. I believe there are two reasonable exceptions to this rule: Exception classes themselves, and certain built-in PHP classes, such as the SPL classes.
It would be perfectly consistent with the process described in this chapter to create an ExceptionFactory class, inject it into objects that throw exceptions, and then use the ExceptionFactory to create the Exception objects to be thrown. This strikes even me as going a bit too far. I think that Exception objects are a reasonable exception to the rule of no new outside Factory objects.
Similarly, I think built-in PHP classes are also frequently an exception to the rule. While it would be nice to have, say, an ArrayObjectFactory or an ArrayIteratorFactory to create instances of the ArrayObject and ArrayIterator classes that SPL itself provides, it may be a little too much. Creating these kinds of objects directly inside the objects that use them is usually all right.
However, we need to be careful. Creating a complex or otherwise powerful object like a PDO connection directly inside the class that needs it is probably overstepping our bounds. It's tough to describe a good rule of thumb here; when in doubt, err on the side of dependency injection.
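To make the two exceptions concrete, here is a minimal sketch of a class that creates an exception and an SPL iterator directly, without going through a Factory. The ItemNames class and its method are hypothetical, invented only for this illustration. A PDO connection, by contrast, would still be passed in through the constructor rather than created here.

```php
<?php
// Hypothetical class, for illustration only.
class ItemNames
{
    protected $names;

    public function __construct(array $names)
    {
        $this->names = $names;
    }

    public function getIterator()
    {
        if (! $this->names) {
            // an Exception object created directly: no ExceptionFactory
            throw new UnexpectedValueException('No item names available.');
        }

        // a built-in SPL object created directly: no ArrayIteratorFactory
        return new ArrayIterator($this->names);
    }
}

$item_names = new ItemNames(array('apple', 'pear'));
$iterator = $item_names->getIterator();
```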
Sometimes we will discover classes that have dependencies, and the dependencies themselves have dependencies. These intermediary dependencies are passed to the outside class, which carries them along only so that the internal objects can be instantiated with them.
For example, say we have a Service class that needs an ItemsGateway, which itself needs a Db connection. Before removing global variables, the Service class might have looked like this:
classes/Service.php
<?php
class Service
{
    public function doThis()
    {
        // ...
        // import the global connection into the method scope
        global $db;
        $items_gateway = new ItemsGateway($db);
        $items = $items_gateway->selectAll();
        // ...
    }

    public function doThat()
    {
        // ...
        // import the global connection into the method scope
        global $db;
        $items_gateway = new ItemsGateway($db);
        $items = $items_gateway->selectAll();
        // ...
    }
}
?>
After removing global variables, we are left with a new keyword, but we still need the Db object as a dependency for ItemsGateway:
classes/Service.php
<?php
class Service
{
    protected $db;

    public function __construct(Db $db)
    {
        $this->db = $db;
    }

    public function doThis()
    {
        // ...
        $items_gateway = new ItemsGateway($this->db);
        $items = $items_gateway->selectAll();
        // ...
    }

    public function doThat()
    {
        // ...
        $items_gateway = new ItemsGateway($this->db);
        $items = $items_gateway->selectAll();
        // ...
    }
}
?>
How do we successfully remove the new keyword here? The ItemsGateway needs a Db connection. The Db connection is never used by the Service directly; it is used only for building the ItemsGateway.
The solution in cases like this is to inject a fully-constructed ItemsGateway. First, we modify the Service class to receive its real dependency, the ItemsGateway:
classes/Service.php
<?php
class Service
{
    protected $items_gateway;

    public function __construct(ItemsGateway $items_gateway)
    {
        $this->items_gateway = $items_gateway;
    }

    public function doThis()
    {
        // ...
        $items = $this->items_gateway->selectAll();
        // ...
    }

    public function doThat()
    {
        // ...
        $items = $this->items_gateway->selectAll();
        // ...
    }
}
?>
Second, throughout the entire legacy application, we change all instantiations of the Service to pass an ItemsGateway.
For example, a page script might have done this when using global variables everywhere:
page_script.php (globals)
<?php
// defines the $db connection
require 'includes/setup.php';

// creates the service with globals
$service = new Service;
?>
And then we changed it to inject the intermediary dependency after removing globals:
page_script.php (intermediary dependency)
<?php
// defines the $db connection
require 'includes/setup.php';

// inject the Db object for the internal ItemsGateway creation
$service = new Service($db);
?>
But we should finally change it to inject the real dependency:
page_script.php (real dependency)
<?php
// defines the $db connection
require 'includes/setup.php';

// create the gateway dependency and then the service
$items_gateway = new ItemsGateway($db);
$service = new Service($items_gateway);
?>
I sometimes hear the complaint that using dependency injection means a lot of extra code to do the same thing as before.
It's true. Consider a call like this, where the class manages its own dependencies internally.
Without dependency injection:
<?php
$items_gateway = new ItemsGateway;
?>
This is obviously less code than using dependency injection by creating the dependencies and using Factory objects.
With dependency injection:
<?php
$db = new Db($db_host, $db_user, $db_pass);
$item_factory = new ItemFactory;
$items_gateway = new ItemsGateway($db, $item_factory);
?>
The real issue here, though, is not the amount of code. The issue is that the code becomes more testable, clearer, and more decoupled.
In looking at the first example, how can we tell what ItemsGateway needs to operate? What other parts of the system will it affect? It's very difficult to tell without examining the entire class and looking for global and new keywords.
In looking at the second example, it is very easy to tell what the class needs to operate, what we can expect it to create, and what parts of the system it interacts with. These things additionally make it easier to test the class later.
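As a rough illustration of the testing benefit, here is a minimal sketch in which a fake Db is injected so the gateway can be exercised without a real database. Every class body below is a stand-in assumed for this example, since the real Db, Item, ItemFactory, and ItemsGateway implementations will differ. In particular, the query() method on Db and the FakeDb class are hypothetical.

```php
<?php
// Stand-in classes, assumed only for this sketch.
class Item
{
    public $data;

    public function __construct(array $item_data)
    {
        $this->data = $item_data;
    }
}

class ItemFactory
{
    public function newInstance(array $item_data)
    {
        return new Item($item_data);
    }
}

class Db
{
    // hypothetical method; the real Db class would talk to a database
    public function query($sql)
    {
        throw new RuntimeException('No real database in this sketch.');
    }
}

// Hypothetical fake that returns canned rows instead of querying.
class FakeDb extends Db
{
    protected $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function query($sql)
    {
        return $this->rows;
    }
}

class ItemsGateway
{
    protected $db;
    protected $item_factory;

    public function __construct(Db $db, ItemFactory $item_factory)
    {
        $this->db = $db;
        $this->item_factory = $item_factory;
    }

    public function selectAll()
    {
        $items = array();
        foreach ($this->db->query('SELECT * FROM items') as $row) {
            $items[] = $this->item_factory->newInstance($row);
        }
        return $items;
    }
}

// Because the dependencies are injected, we can swap in the fake.
$db = new FakeDb(array(array('id' => 1, 'name' => 'Widget')));
$items_gateway = new ItemsGateway($db, new ItemFactory);
$items = $items_gateway->selectAll();
```

With the first style of call, where the gateway builds its own connection internally, there is no comparable place to substitute the fake.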
In the examples above, our Factory class only creates a single newInstance() of an object. If we regularly create collections of objects, it may be reasonable to add a newCollection() method to our Factory. For example, given our ItemFactory above, we may do something like the following:
classes/ItemFactory.php
<?php
class ItemFactory
{
    public function newInstance(array $item_data)
    {
        return new Item($item_data);
    }

    public function newCollection(array $items_data)
    {
        $collection = array();
        foreach ($items_data as $item_data) {
            $collection[] = $this->newInstance($item_data);
        }
        return $collection;
    }
}
?>
We may go so far as to create an ItemCollection class for the collection instead of using an array. If so, it would be reasonable to use a new keyword inside our ItemFactory to create the ItemCollection instance. (The ItemCollection class is omitted here.)
classes/ItemFactory.php
<?php
class ItemFactory
{
    public function newInstance(array $item_data)
    {
        return new Item($item_data);
    }

    public function newCollection(array $item_rows)
    {
        $collection = new ItemCollection;
        foreach ($item_rows as $item_data) {
            $item = $this->newInstance($item_data);
            $collection->append($item);
        }
        return $collection;
    }
}
?>
Indeed, we may wish to have a separate ItemCollectionFactory, using an injected ItemFactory to create Item objects, with its own newInstance() method to return a new ItemCollection.
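One way such a separate factory might look is sketched below. This is only an illustration under stated assumptions: the Item class is a bare stand-in, and ItemCollection is assumed to be a thin subclass of the SPL ArrayObject class so that it has an append() method. The real ItemCollection may be built quite differently.

```php
<?php
// Stand-in for the real Item class.
class Item
{
    public $data;

    public function __construct(array $item_data)
    {
        $this->data = $item_data;
    }
}

class ItemFactory
{
    public function newInstance(array $item_data)
    {
        return new Item($item_data);
    }
}

// Assumption: a thin ArrayObject subclass, which provides append().
class ItemCollection extends ArrayObject
{
}

class ItemCollectionFactory
{
    protected $item_factory;

    // the ItemFactory is injected rather than created here
    public function __construct(ItemFactory $item_factory)
    {
        $this->item_factory = $item_factory;
    }

    public function newInstance(array $item_rows)
    {
        $collection = new ItemCollection;
        foreach ($item_rows as $item_data) {
            $item = $this->item_factory->newInstance($item_data);
            $collection->append($item);
        }
        return $collection;
    }
}

$item_collection_factory = new ItemCollectionFactory(new ItemFactory);
$collection = $item_collection_factory->newInstance(array(
    array('name' => 'first'),
    array('name' => 'second'),
));
```

Here the only new keyword in ItemCollectionFactory creates the collection itself; the creation of each Item is delegated to the injected ItemFactory.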
There are many variations on the proper use of Factory objects. The key is to keep object creation (and related operations) separate from object manipulation.
All the dependency injection we have been doing so far has been manual injection, where we create the dependencies ourselves and then inject them as we create the objects we need. This can be a tedious process. Who wants to create a Db object over and over again just so it can be injected into a variety of Gateway classes? Isn't there some way to automate that?
Yes, there is. It is called a Container. A Container may go by various synonyms indicating how it is to be used. A Dependency Injection Container is intended to be used always-and-only outside the non-Factory classes, whereas an identical Container implementation going by the name Service Locator is intended to be used inside non-Factory objects.
Using a Container brings distinct advantages:
A Container can house a Db instance that only gets created when we ask the Container for a database connection; the connection is created once and then reused over and over again.
Complex object creation can be placed in the Container, where objects that need multiple services for their constructor parameters can retrieve those services from the Container inside their own creation logic.
But using a Container has disadvantages as well:
A Container used as a Service Locator replaces our global variables with a fancy new toy that has many of the same problems as global. The Container hides dependencies because it is called only from inside the class that needs dependencies.
At this stage of modernizing our legacy application it can be very tempting to start using a Container to automate dependency injection for us. I suggest that we do not add one just now, because so much of our legacy application remains to be modernized. We will add one eventually, but it will be as the very last step of our modernization process.
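Purely to illustrate the shared-service idea, and not as something to add to the application at this stage, here is a minimal sketch of a Container that creates a service the first time it is requested and reuses it afterwards. The Container class and its set() and get() methods are hypothetical, not taken from any particular library. The Db and ItemsGateway classes are bare stand-ins, and the static counter on Db exists only to show how many times it is constructed.

```php
<?php
// Hypothetical minimal Container: lazily creates and then shares services.
class Container
{
    protected $definitions = array();
    protected $instances = array();

    // $definition is a callable that receives the Container and
    // returns the service object
    public function set($name, $definition)
    {
        $this->definitions[$name] = $definition;
    }

    public function get($name)
    {
        if (! isset($this->definitions[$name])) {
            throw new InvalidArgumentException("No service named '$name'.");
        }

        // create the service on first request only, then reuse it
        if (! isset($this->instances[$name])) {
            $this->instances[$name] = call_user_func(
                $this->definitions[$name],
                $this
            );
        }

        return $this->instances[$name];
    }
}

// Stand-in classes, assumed only for this sketch.
class Db
{
    public static $created = 0;

    public function __construct()
    {
        self::$created++;
    }
}

class ItemsGateway
{
    public $db;

    public function __construct(Db $db)
    {
        $this->db = $db;
    }
}

$container = new Container;
$container->set('db', function ($container) {
    return new Db;
});
$container->set('items_gateway', function ($container) {
    // creation logic retrieves its own dependency from the Container
    return new ItemsGateway($container->get('db'));
});

$created_before = Db::$created;                    // nothing built yet
$items_gateway = $container->get('items_gateway'); // builds Db, then the gateway
$db = $container->get('db');                       // reuses the same Db
```

Calling get() from a page script like this, outside the classes, is the Dependency Injection Container style of use. Passing the Container into a Service and calling get() inside its methods would be the Service Locator style, with the hidden-dependency problem described above.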