When it comes to page scripts in legacy applications, it is very common to see business logic intertwined with presentation logic. For example, the page script does some setup work, then includes a header template, makes a call to the database, outputs the results, calculates some values, prints the calculated values, writes the values back to the database, and includes a footer template.

We have taken some steps toward decoupling these concerns by extracting a domain layer for our legacy application. However, calls to the domain layer and other business logic within the page scripts are still mixed in with the presentation logic. Among other things, this intermingling of concerns makes it difficult to test the different aspects of our legacy application.

In this chapter, we will separate all of our presentation logic into its own layer so we can test it separately from our business logic.

For an example of embedded presentation logic, we can take a look at Appendix E, Code before Collecting Presentation Logic. The code shows a page script that has been refactored to use domain Transactions, but it still has some presentation logic entangled within the rest of the code.

The key to decoupling the presentation logic from the business logic is to put the code for them into separate scopes. The script should first perform all of the business logic, then pass the results over to the presentation logic. When that is complete, we will be able to test our presentation logic separately from our business logic.

To achieve this separation of scope, we will move toward using a Response object in our page scripts. All of our presentation logic will be executed from within a Response instance, instead of directly in the page script. Doing so will provide the scope separation that we need to decouple all output generation, including HTTP headers and cookies, from the rest of the page script.

Extracting presentation logic is not as difficult as extracting domain logic. However, it does require careful attention and lots of testing along the way.

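In general, the process is as follows:

  • Find a page script that contains presentation logic mixed in with the rest of its code.
  • Rearrange the code so that all presentation logic is collected into a single block at the end of the script, then spot check the result.
  • Extract that block to a view file, execute it through a Response, and spot check again.
  • Add proper escaping to the output.
  • Write tests for the view file.
  • Commit the tested work, and continue with the next page script that needs the same treatment.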

Now that we have a candidate page script, we need to rearrange the code so there is a clear demarcation between the presentation logic and everything else. For our example here, we will use the code in Appendix E, Code before Collecting.

First, we go to the bottom of the file and add a /* PRESENTATION */ comment as the final line. We then go back to the top of the file. Working line-by-line and block-by-block, we move all presentation logic to the end of the file after our /* PRESENTATION */ comment. When we are done, the part before the /* PRESENTATION */ comment should consist only of business logic, and the part after should consist only of presentation logic.
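As a rough illustration of the target shape, here is a minimal sketch of a page script after the rearrangement. It is not the Appendix F listing: the $items data and the $total calculation are hypothetical stand-ins for real calls into the domain layer.

```php
<?php
// Hypothetical page script after collecting; not the Appendix F listing.

// Business logic: read input and compute values, with no output at all.
$id = isset($_GET['id']) ? (int) $_GET['id'] : 0;
$items = array(                 // stand-in for a call to the domain layer
    array('name' => 'Widget', 'price' => 5),
    array('name' => 'Gadget', 'price' => 7),
);
$total = 0;
foreach ($items as $item) {
    $total += $item['price'];
}

/* PRESENTATION */
// Everything from here down only reads the variables set above.
$page_title = $id ? "Order {$id}" : 'New Order';
echo '<h1>' . htmlspecialchars($page_title, ENT_QUOTES, 'UTF-8') . "</h1>\n";
foreach ($items as $item) {
    echo '<p>' . htmlspecialchars($item['name'], ENT_QUOTES, 'UTF-8') . "</p>\n";
}
echo '<p>Total: ' . (int) $total . "</p>\n";
```

Note that $page_title is set below the comment because only the presentation uses it.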

Given our starting code in Appendix E, Code before Collecting, we should end up with something more like the code in Appendix F, Code after Collecting. In particular, note that we have the following:

  • Moved variables not used by the business logic, such as $current_page, down to the presentation block
  • Moved the header.php include down to the presentation block
  • Moved logic and conditions acting only on presentation variables, such as the if that sets the $page_title, to the presentation block
  • Replaced $_SERVER['PHP_SELF'] with an $action variable
  • Replaced $_GET['id'] with an $id variable

Now that we have rearranged the page script so that all presentation logic is collected at the end, we need to spot check to make sure the page script still works properly. As usual, we do this by running our pre-existing characterization tests, if we have any. If not, we must browse to or otherwise invoke the changed code.

If the page does not generate the same output as before, our rearrangement has changed the logic somehow. We need to undo and redo the rearrangement until the page works as it should.

Once our spot check is successful, we may wish to commit our changes so far. If our next set of changes goes badly, we can revert the code to this point as a known working state.

Now that we have a working page script with all the presentation logic in a single block, we will extract that entire block to its own file, and then use a Response to execute the extracted logic.

First, we need a place to put view files in our legacy application. While I prefer to keep presentation logic near the business logic, that kind of arrangement will make trouble for us in later modernization steps. As such, we will create a new directory in our legacy application called views/ and place our view files there. This directory should be at the same level as our classes/ and tests/ directories. For example:
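A sketch of the resulting layout, where /path/to/app/ is a placeholder for the application root and only the directories named above are shown:

```
/path/to/app/
    classes/    # existing classes
    tests/      # existing tests
    views/      # new: view files go here
```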

Next, we cut the presentation block from the page script, and paste it into our new view file as-is.

Then, in place of the original presentation block, we create a Response object in the page script and point it to our view file with setView(). We also set up an empty call to setVars() for later, and finally call the send() method.

For example:
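The following is a minimal sketch under stated assumptions rather than the real Response class. SketchResponse is a hypothetical stand-in that mimics only the setView(), setVars(), and send() calls described above, and the view file is written to a temporary directory so the sketch can run on its own. In the application itself, the view would live under views/.

```php
<?php
// Hypothetical stand-in, NOT the book's Response class: just enough to
// show setView(), setVars(), and send() as described in the text.
class SketchResponse
{
    protected $base;
    protected $view;
    protected $vars = array();

    public function __construct($base)
    {
        $this->base = rtrim($base, '/');
    }

    public function setView($view)
    {
        $this->view = $view;
    }

    public function setVars(array $vars)
    {
        $this->vars = $vars;
    }

    // Runs the view in this method's scope, so the view sees only the
    // variables passed through setVars() (plus $this).
    public function requireView()
    {
        extract($this->vars);
        ob_start();
        require $this->base . '/' . $this->view;
        return ob_get_clean();
    }

    public function send()
    {
        echo $this->requireView();
    }
}

// Stand-in for views/articles.html.php so the sketch is self-contained.
$views = sys_get_temp_dir() . '/sketch_views_' . uniqid();
mkdir($views);
file_put_contents($views . '/articles.html.php', "<p>Articles go here.</p>\n");

// ... business logic above; the presentation block is replaced by this:
$response = new SketchResponse($views);
$response->setView('articles.html.php');
$response->setVars(array());   // empty for now; filled in during spot checks
$response->send();
```

Because the view is required inside a method, it no longer shares scope with the page script, which is exactly the separation we are after.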

At this point, we have successfully decoupled the presentation logic from the page script. We can remove the /* PRESENTATION */ comment. It has served its purpose and is no longer needed.

However, this decoupling fundamentally breaks the presentation logic, because the view file depends on variables from the page script. With that in mind, we begin a spot check-and-modify cycle. We browse to or otherwise invoke the page script and discover that a particular variable is not available to the presentation. We add it to the setVars() array, and spot check again. We continue adding variables to the setVars() array until the view file has everything it needs, and our spot check runs become completely successful.

Given our earlier examples in Appendix E, Code before Collecting and Appendix F, Code after Collecting, we end up at Appendix G, Code after Response View File. We can see that the articles.html.php view file needed four variables: $id, $failure, $input, and $action:

<?php
// ...
$response->setVars(array(
    'id' => $id,
    'failure' => $article_transactions->getFailure(),
    'input' => $article_transactions->getInput(),
    'action' => $_SERVER['PHP_SELF'],
));
// ...
?>

Once we have a working page script, we may wish to commit our work yet again so that we have a known correct state to which we can revert later, if needed.

Unfortunately, most legacy applications pay little or no attention to output security. One of the most common vulnerabilities is cross-site scripting (XSS).

The defense against XSS is to escape all variables all the time for the context in which they are used. If a variable is used as HTML content, it needs to be escaped as HTML content; if a variable is used in an HTML attribute, it needs to be escaped as such, and so on.

Defending against XSS requires diligence on the part of the developer. If we remember one thing about escaping output, it should be the htmlspecialchars() function. Using this function appropriately will save us from most, but not all, XSS exploits.

When using htmlspecialchars(), we must be sure to pass a quotes constant and a character set each time. Thus, it is not enough to call htmlspecialchars($unescaped_text). We must call htmlspecialchars($unescaped_text, ENT_QUOTES, 'UTF-8'). So, output that looks like this:

This needs to be escaped like this:
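As a minimal sketch of both forms, assume a hypothetical $title variable that carries user-supplied text:

```php
<?php
// Hypothetical user-supplied value containing markup.
$title = '<script>alert("xss")</script>';

// Unescaped: the browser would treat the value as live HTML.
$unsafe = '<h1>' . $title . '</h1>';

// Escaped for an HTML context, with an explicit quote flag and charset.
$safe = '<h1>' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '</h1>';

echo $safe, "\n";
```

In the escaped form, the angle brackets and quotes arrive at the browser as entities, so the script is displayed as text instead of being executed.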

Any time we send unescaped output, we need to be aware that we are likely opening up a security hole. As such, we must apply escaping to every variable we use for output.

Calling htmlspecialchars() repeatedly this way can be cumbersome, so the Response class provides an esc() method as an alias to htmlspecialchars() with reasonable settings:
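A sketch of what such an alias might look like follows. The EscapingSketch class is hypothetical, and the flag and charset are simply the ones recommended above, not necessarily the defaults of the real method.

```php
<?php
// Hypothetical sketch of an esc() helper; the flag and charset are the
// ones recommended in the text, not necessarily the real method's defaults.
class EscapingSketch
{
    public function esc($string)
    {
        return htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
    }
}

$view = new EscapingSketch();
echo '<input name="q" value="' . $view->esc('say "hi" & \'bye\'') . '">' . "\n";
```

Inside a view file executed by a Response, the call would presumably be written as $this->esc($var).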

Be aware that escaping via htmlspecialchars() is only a starting point. While escaping itself is simple to do, it can be difficult to know the appropriate escaping technique for a particular context.

Unfortunately, it is outside the scope of this book to provide a thorough overview of escaping and other security techniques. For more information, and for a good stand-alone escaping tool, please see the Zend\Escaper (https://framework.zend.com/manual/2.2/en/modules/zend.escaper) library.

After we escape all output in the Response view file, we can move along to testing.

Writing tests for view files presents some unique challenges. Until this chapter, all of our tests have been against classes and class methods. Because our view files are, well, files, we need to place them into a slightly different testing structure.

First, we need to create a views/ subdirectory in our tests/ directory. After that, our tests/ directory should look something like this:
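A sketch of that layout, assuming the tests/ directory already holds a classes/ subdirectory and the phpunit.xml file (both assumptions about the existing setup):

```
tests/
    classes/        # existing class tests (assumed)
    phpunit.xml     # existing PHPUnit configuration (assumed location)
    views/          # new: view file tests
```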

Next, we need to modify the phpunit.xml file so it knows to scan through the new views/ subdirectory for tests:
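A minimal sketch of the relevant fragment follows. It assumes a single test suite that already lists ./classes, and the suite name is hypothetical. Any existing attributes and elements in the real file stay as they are; only the second directory line is new.

```xml
<phpunit>
    <testsuites>
        <testsuite name="legacy">
            <directory>./classes</directory>
            <directory>./views</directory>
        </testsuite>
    </testsuites>
</phpunit>
```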

Now that we have a location for our view file tests, we need to write one.

Although we are testing a file, PHPUnit requires each test to be a class. As a result, we will name our test for the view file being tested, and place it in a subdirectory under tests/views/ that mimics the original view file location. For example, if we have a view file at views/foo/bar/baz.html.php, we would create a test file at tests/views/foo/bar/BazHtmlTest.php. Yes, this is a bit ugly, but it will help us keep track of which tests map to which views.

In our test class, we will create a Response instance like the one at the end of our page script. We will pass into it the view file path and the needed variables. We will finally require the view, then check the output and headers to see if the view works correctly.

Given our articles.html.php file, our initial test might look like this:
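The sketch below rests on several assumptions: PHPUnit is installed, the Response class is autoloadable as \Mlaphp\Response, and its constructor takes the views/ base path. The values passed to setVars() are hypothetical. Older PHPUnit versions use PHPUnit_Framework_TestCase as the base class and a setUp() method without the void return type.

```php
<?php
// Hypothetical location: tests/views/ArticlesHtmlTest.php
use PHPUnit\Framework\TestCase;

class ArticlesHtmlTest extends TestCase
{
    protected $response;
    protected $output;

    protected function setUp(): void
    {
        // Assumed class name and constructor argument (the views/ path).
        $this->response = new \Mlaphp\Response('/path/to/app/views');
        $this->response->setView('articles.html.php');
        $this->response->setVars(array(
            'id' => '123',
            'failure' => array(),
            'input' => array('title' => 'Article Title'), // hypothetical shape
            'action' => '/articles.php',
        ));

        // Require the view and keep its captured output for the tests.
        $this->output = $this->response->requireView();
    }

    public function testBasicView()
    {
        $expect = '';   // deliberately empty so that the first run fails
        $this->assertSame($expect, $this->output);
    }
}
```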

When we run this test, it will fail. We rejoice, because the $expect value is empty, but the output should have a lot of content in it. This is the correct behavior. (If the test passes, something is probably wrong.)

Now we need our test to look at the output to see if it is correct.

The simplest way to do this is to dump the actual $this->output string and copy its value to the $expect variable. If the output string is relatively short, a single assertSame($expect, $this->output) call to confirm that the two are identical should be perfectly sufficient for our purposes.

However, if anything changes with any of the other files that our main view file includes, then the test will fail. The failure will occur not because the main view has changed, but because a related view has changed. That is not the kind of failure that helps us.

In the case of large output strings, we can look for an expected substring and make sure it is present in the actual output. Then when the test fails, the failure will be related to the particular substring for which we are testing, not to the entire output string as a whole.

For example, we can use strpos() to see if a particular string is in the output. If the haystack of $this->output does not contain the $expect needle, strpos() will return a boolean false. Any other value, including 0 for a match at the very start of the output, means the needle is present, so the comparison must be strict (=== false). (This logic is easier to read if we write our own custom assertion method.)
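One way to package that logic is sketched below as a trait. The trait name OutputAssertions is hypothetical, and it assumes the test class stores the captured view output in $this->output and inherits fail() and assertTrue() from PHPUnit.

```php
<?php
// Hypothetical helper: a trait to mix into a PHPUnit test class that keeps
// the captured view output in $this->output.
trait OutputAssertions
{
    protected function assertOutputHas($expect)
    {
        // strpos() returns false only when the needle is absent; a match
        // at position 0 is still a match, hence the strict comparison.
        if (strpos($this->output, $expect) === false) {
            $this->fail("Did not find expected output: {$expect}");
        }

        // Count a successful check as an assertion in PHPUnit's report.
        $this->assertTrue(true);
    }
}
```

A test class would add `use OutputAssertions;` to its body and then call, for instance, `$this->assertOutputHas('<form');` from a test method.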

This approach has the benefit of being very straightforward, but may not be suitable for complex assertions. We may wish to count the number of times an element occurs, or to assert that the HTML has a particular structure without referencing the contents of that structure, or to check that an element appears in the right place in the output.

For these more complex content assertions, older PHPUnit releases have an assertSelectEquals() assertion, along with other related assertSelect*() methods (these were deprecated and then removed in later PHPUnit versions). They work by using CSS selectors to check different parts of the output, but can be difficult to read and understand.

Alternatively, we may prefer to install Zend\Dom\Query for finer manipulation of the DOM tree. This library also works by using CSS selectors to pick apart the content. It returns DOM nodes and node lists, which makes it very useful for testing the content in a fine-grained manner.
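Zend\Dom\Query itself is not shown here. As a dependency-free sketch of the same kind of structural check, PHP's bundled DOM extension can count matching nodes, using XPath expressions instead of CSS selectors. The countMatches() helper and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: counts the nodes in an HTML string that match an
// XPath query, using PHP's bundled DOM extension.
function countMatches($html, $query)
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    return $xpath->query($query)->length;
}

$output = '<ul id="articles"><li>One</li><li>Two</li></ul><p>Footer</p>';
echo countMatches($output, '//ul[@id="articles"]/li'), "\n"; // prints 2
```

A test could then assert on the count without caring about the text inside each element.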

Unfortunately, I cannot give concrete advice on which of these approaches is best for you. I suggest starting with an approach similar to the assertOutputHas() method above, and moving along to the Zend\Dom\Query approach when it becomes obvious that you need a more powerful system.

After we have written tests that confirm the presentation works as it should, we move on to the last part of the process.

In the above examples we paid attention only to output from echo and print. However, it is often the case that a page script will also set HTTP headers via header(), setcookie(), and setrawcookie(). These, too, generate output.

Dealing with these output methods can be problematic. Whereas the Response class uses output buffering to capture echo and print into return values, there is no similar option for buffering calls to header() and related functions. Because the output from these functions is not buffered, we cannot easily test to see what's going on.

This is one place where having a Response object really helps us. The class comes with methods that buffer the header() and related native PHP functions, but do not call those functions until send() time. This allows us to capture the inputs to these calls and test them before they are actually activated.

For example, say we have some code like this in a contrived view file:
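A minimal sketch of such a view might look like the following. The file name foo.json.php and the $data value are hypothetical, and $data is defined inline only so that the snippet runs on its own.

```php
<?php
// Hypothetical contrived view file, e.g. foo.json.php. The native calls
// below go straight to PHP instead of through an object we control.
$data = array('foo' => 'bar'); // normally passed in by the page script

header('Content-Type: application/json');
setcookie('foo', 'bar');
setrawcookie('baz', 'dib');
echo json_encode($data);
```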

Among other things, we cannot test that the headers are what we expect them to be. PHP has already sent them to the client.

When using a view file with a Response object, we can prefix the native function calls with $this-> to call a Response method instead of the native PHP function. The Response methods buffer the arguments to the native calls instead of making the calls directly. This allows us to inspect the arguments before they are delivered as output.

We can now test the Response object to check the HTTP body as well as the HTTP headers.
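To illustrate the idea, here is a minimal sketch under stated assumptions. HeaderBufferingSketch is a hypothetical stand-in, not the real Response class, and it covers only header() and setcookie(). The contrived view is written to a temporary file so the sketch can run on its own.

```php
<?php
// Hypothetical stand-in, NOT the book's Response class: it records header
// calls instead of making them, so that a test can inspect them.
class HeaderBufferingSketch
{
    protected $headers = array();
    protected $vars = array();

    public function setVars(array $vars)
    {
        $this->vars = $vars;
    }

    // Same call shape as native header(), but only records the arguments.
    public function header($string, $replace = true, $code = null)
    {
        $this->headers[] = array('header', $string, $replace, $code);
    }

    // Same idea for setcookie(): record now, call the native function later.
    public function setcookie($name, $value = '')
    {
        $this->headers[] = array('setcookie', $name, $value);
    }

    public function getHeaders()
    {
        return $this->headers;
    }

    // Runs the view file inside this object, so $this-> calls land here.
    public function requireView($file)
    {
        extract($this->vars);
        ob_start();
        require $file;
        return ob_get_clean();
    }
}

// Contrived view source; note the $this-> prefix on the header functions.
$view_source = <<<'VIEW'
<?php
$this->header('Content-Type: application/json');
$this->setcookie('foo', 'bar');
echo json_encode($data);
VIEW;
$file = tempnam(sys_get_temp_dir(), 'view');
file_put_contents($file, $view_source);

$response = new HeaderBufferingSketch();
$response->setVars(array('data' => array('foo' => 'bar')));
$body = $response->requireView($file);   // captured HTTP body
$headers = $response->getHeaders();      // recorded, not yet sent
unlink($file);
```

A test can now compare $body and $headers against expected values, because nothing has been delivered to the client yet.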

Many times, a legacy application will have a view or template system already in place. If so, it may be sufficient to keep using the existing template system instead of introducing a new Response class.

If we decide to keep an existing template system, the other steps in this chapter still apply. We need to move all of the template calls to a single location at the end of the page script, disentangling all of the template interactions from the rest of the business logic. We can then display the template at the end of the page script. For example:
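The sketch below shows the shape of such a page script. LegacyTemplate is a hypothetical stand-in for whatever template system the application already has, so its method names and placeholder syntax are assumptions.

```php
<?php
// Hypothetical stand-in for an existing template system; real systems
// (and their method names) will differ.
class LegacyTemplate
{
    protected $vars = array();

    public function assign($name, $value)
    {
        $this->vars[$name] = $value;
    }

    // Replaces {name} placeholders with escaped values.
    public function fetch($source)
    {
        $replace = array();
        foreach ($this->vars as $name => $value) {
            $replace['{' . $name . '}'] = htmlspecialchars(
                (string) $value,
                ENT_QUOTES,
                'UTF-8'
            );
        }
        return strtr($source, $replace);
    }

    public function display($source)
    {
        echo $this->fetch($source);
    }
}

// Business logic: no template calls anywhere in this part.
$name = 'World';
$count = 2 + 3; // stand-in for a result from the domain layer

// Presentation: all template interaction is collected at the end.
$template = new LegacyTemplate();
$template->assign('name', $name);
$template->assign('count', $count);
$template->display("<p>Hello {name}, you have {count} items.</p>\n");
```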

If we are not sending HTTP headers, this approach is just as testable as using a Response object. However, if we mix in calls to header() and related functions, our testability will be more limited.

In the interest of future-proofing our legacy code, we may move the template logic to a view file, and interact with a Response object in our page script instead. For example:

This allows us to keep using the existing template logic and files, while adding testability for HTTP headers via the Response object.

For consistency's sake, we should either use the existing template system or wrap all template logic in view files via Response objects. We should not use the template system in some page scripts and the Response object in others. In later chapters, it will be important that we have a single way of interacting with the presentation layer in our page scripts.

Most of the time, our presentation is small enough that it can be buffered into memory by PHP until it is ready to send. However, sometimes our legacy application may need to send large amounts of data, such as a file that is tens or hundreds of megabytes.

Reading a large file into memory so that we can output it to the user is usually not a good approach. Instead, we stream the file: we read a small piece of the file and send it to the user, then read the next small piece and send it to the user, and so on until the whole file has been delivered. That way, we never have to keep the entire file in memory.

The examples so far have only dealt with buffering a view into memory and then outputting it all at once, not with streaming. It would be a poor approach for our view file to read the entire resource into memory and then output it. At the same time, we need to make sure headers are delivered before any streamed content.

The Response object has a method to handle this situation. The Response method setLastCall() allows us to set a user-defined function (a callable) to invoke after requiring the view file and sending the headers. With this, we can pass a class method that will stream the resource out for us.

For example, say we need to stream out a large image file. We can write a class like the following to handle the stream logic:
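A minimal sketch of such a class follows. The method name send(), the optional destination argument, and the 8192-byte chunk size are assumptions made for illustration.

```php
<?php
// A sketch of the streaming class; the method name, the optional $dest
// argument, and the chunk size are assumptions. Error handling is
// deliberately minimal.
class FileStreamer
{
    public function send($file, $dest = null)
    {
        // Default to PHP's output stream when no destination is given.
        $out = $dest ? $dest : fopen('php://output', 'wb');
        $in = fopen($file, 'rb');
        while (! feof($in)) {
            // Only one chunk is held in memory at any time.
            fwrite($out, fread($in, 8192));
        }
        fclose($in);

        // Close the output handle only if we opened it ourselves.
        if (! $dest) {
            fclose($out);
        }
    }
}
```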

There is much to be desired here, such as error checking and better resource handling, but it accomplishes the purpose for our example.

We can then create an instance of the FileStreamer in our page script, and the view file can use it as the callable argument for setLastCall():
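The sketch below is hypothetical throughout. LastCallSketch is a stand-in that shows only the ordering involved, not the real Response class, and a closure using readfile() stands in for a FileStreamer method. The image and view are written to temporary files so the sketch can run on its own.

```php
<?php
// Hypothetical stand-in, NOT the book's Response class: it shows only the
// order of operations at send() time when a last call has been set.
class LastCallSketch
{
    protected $headers = array();
    protected $last_call = array();

    // Buffer the header instead of sending it right away.
    public function header($string)
    {
        $this->headers[] = $string;
    }

    // Remember a callable, plus any arguments for it, until send() time.
    public function setLastCall($func)
    {
        $this->last_call = func_get_args();
    }

    public function send($view_file, array $vars)
    {
        // 1. Require the view, capturing anything it echoes.
        extract($vars);
        ob_start();
        require $view_file;
        $body = ob_get_clean();

        // 2. Send the buffered headers, then the captured body.
        foreach ($this->headers as $header) {
            header($header);
        }
        echo $body;

        // 3. Invoke the last call, which streams the large resource.
        if ($this->last_call) {
            $args = $this->last_call;
            $func = array_shift($args);
            call_user_func_array($func, $args);
        }
    }
}

// Contrived image and view file, written to temporary locations.
$image = tempnam(sys_get_temp_dir(), 'img');
file_put_contents($image, 'pretend image bytes');
$view = tempnam(sys_get_temp_dir(), 'view');
file_put_contents($view, '<?php
$this->header("Content-Type: image/jpeg");
$this->setLastCall($streamer, $image);
');

// Page script: a closure stands in for array($file_streamer, 'send').
$streamer = function ($file) {
    readfile($file);
};
$response = new LastCallSketch();
ob_start(); // captured only so the sketch can inspect what was sent
$response->send($view, array('streamer' => $streamer, 'image' => $image));
$sent = ob_get_clean();
unlink($image);
unlink($view);
```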

At send() time, the Response will require the view file, which sets a header and the last call with arguments. The Response then sends the headers and the captured output of the view (which in this case is nothing). Finally, it invokes the callable and arguments from setLastCall(), which streams out the file.

In the example code from this chapter, we had only a handful of variables to pass to the presentation logic. Unfortunately, it is more likely that there will be 10 or 20 or more variables to pass. This is usually because the presentation is composed of several include files, each of which needs its own variables.

These extra variables are usually needed for things like the site header, navigation, and footer portions. Because we have decoupled the business logic from the presentation logic and are executing the presentation logic in a separate scope, we have to pass in all the variables needed for all the include files.

Say we have a view file that includes a header.php file, like this:

Our page script will have to pass $page_title, $page_style, and $site_nav variables in order for the header to display properly. This is a relatively tame case; there could be many more variables than this.

One solution is to collect commonly-used variables into one or more objects of their own. We can then pass those common-use objects into the Response for the view file to use. For example, header-specific display variables can be placed in a HeaderDisplay class, which can then be passed to the Response.

We can then modify the header.php file to use the HeaderDisplay object, and the page script can pass an instance of HeaderDisplay instead of all the separate header-related variables.
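A minimal sketch of the idea follows. The property names mirror the three header variables mentioned above, while the renderHeader() function is a hypothetical stand-in for the modified header.php include, and the shape of $site_nav (URL to label) is an assumption.

```php
<?php
// Hypothetical sketch of a HeaderDisplay class; the property names follow
// the three header variables named in the text.
class HeaderDisplay
{
    public $page_title;
    public $page_style;
    public $site_nav;

    public function __construct($page_title, $page_style, array $site_nav)
    {
        $this->page_title = $page_title;
        $this->page_style = $page_style;
        $this->site_nav = $site_nav;
    }
}

// Stand-in for the modified header.php: it reads one object instead of
// three separate variables, escaping each value for output.
function renderHeader(HeaderDisplay $header_display)
{
    $esc = function ($value) {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    };
    $html = '<title>' . $esc($header_display->page_title) . "</title>\n";
    $html .= '<link rel="stylesheet" href="'
        . $esc($header_display->page_style) . "\">\n";
    foreach ($header_display->site_nav as $href => $label) {
        $html .= '<a href="' . $esc($href) . '">' . $esc($label) . "</a>\n";
    }
    return $html;
}

// Page script: one object to pass along instead of three variables.
$header_display = new HeaderDisplay(
    'Articles',
    '/css/site.css',
    array('/' => 'Home', '/articles.php' => 'Articles')
);
echo renderHeader($header_display);
```

The page script would then pass the single object through setVars(), for instance as 'header_display' => $header_display.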

When rearranging the page script to separate the business logic from the presentation logic, we may discover that the presentation code makes calls to Transactions or other classes or resources. This is a pernicious form of mixing concerns, since the presentation is dependent on the results of these calls.

If the called code is specifically for output, then there's no problem; we can leave the calls in place. But if the called code interacts with an external resource such as a database or a network connection, we have a mixing of concerns that needs to be separated.

The solution is to extract an equivalent set of business logic calls from the presentation logic, capture the results to a variable, and then pass that variable to the presentation.

For a contrived example, the following mixed code makes database calls and then presents them in a single loop:

Ignore for a moment that we need to solve the N+1 query problem presented in the example, and that this might better be solved at the Transactions level. How can we disentangle the presentation from the data retrieval?

In this case, we build an equivalent set of code to capture the needed data, then pass that data to the presentation logic, and apply proper escaping:
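A minimal sketch of the separated form is shown below. The two functions are hypothetical stand-ins for Transactions calls that would normally reach the database, and the 'comment_count' key is an assumed name.

```php
<?php
// Hypothetical stand-ins for Transactions calls that would hit a database.
function fetchTopPosts()
{
    return array(
        array('id' => 1, 'title' => 'First <b>post</b>'),
        array('id' => 2, 'title' => 'Second post'),
    );
}

function fetchCommentCount($post_id)
{
    $counts = array(1 => 3, 2 => 0);
    return $counts[$post_id];
}

// Business logic: gather everything the presentation will need.
$posts = array();
foreach (fetchTopPosts() as $post) {
    $post['comment_count'] = fetchCommentCount($post['id']);
    $posts[] = $post;
}

// Presentation logic: loops over plain data only, escaping as it goes.
$html = '';
foreach ($posts as $post) {
    $html .= htmlspecialchars($post['title'], ENT_QUOTES, 'UTF-8')
        . ' has ' . (int) $post['comment_count'] . " comments.\n";
}
echo $html;
```

Only the second loop would move into the view file; the first stays in the page script and its result is passed along as a variable.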

Yes, we end up looping over the same data twice -- once in the business logic, and once in the presentation logic. While this may reasonably be called inefficient in some ways, efficiency is not our primary goal. Separation of concerns is our primary goal, and this approach achieves that nicely.