Of the refactoring processes described in this book, extracting domain logic is going to be the most difficult, time consuming, and detail-oriented. It requires a lot of care and attention. The domain logic is the very core of our legacy application, and we need to make sure to pull out just the right parts. This means success depends entirely on our familiarity and competence with the legacy application as it exists now.
Luckily, our prior exercises in modernizing our legacy codebase have given us a broad overview of the application as a whole, as well as deep knowledge of the specific parts we have had to extract and refactor. This should endow us with the confidence to complete this task successfully. It is a demanding, but ultimately satisfying, activity.
In general, we proceed as follows:
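1. Search the entire codebase for uses of Gateway classes that exist outside Transactions classes.
2. Where we find Gateway usage, examine the logic surrounding the Gateway operations to discover which portions of that logic are related to the domain behaviors of the application.
3. Extract the relevant domain logic to one or more Transactions classes related to the domain elements, and modify the original code to use the Transactions class instead of the embedded domain logic.
4. Spot check to make sure the original code still works with the extracted logic.
5. Write tests for the extracted Transactions logic, refining them along with the tested code until they pass.
6. When all tests pass, commit the new code and tests, push to the central repository, and notify QA.
7. Search again for uses of Gateway classes, and continue extracting domain logic until Gateway usage exists only in Transactions.
As in earlier chapters, we use our project-wide search facility to find where we create new instances of Gateway classes: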
Search for:
new .*Gateway
The new Gateway instance may be used directly in a page script, in which case we have found some candidate code for extracting domain logic. If the Gateway instance is injected into a class, we now need to dive into that class to find where the Gateway is used. The code surrounding that usage will be our candidate for extracting domain logic.
When extracting logic to a class method, we should be careful to follow all the lessons we learned about dependency injection in prior chapters. Among other things, this means: no use of globals, replacing superglobals with a Request object, no use of the new keyword outside Factory classes, and (of course) injecting objects via the constructor as needed.
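As a minimal sketch of these rules, a Transactions class might look something like the following. The class and method names are assumptions for illustration, the gateway is an in-memory stand-in rather than a real database wrapper, and a plain array stands in for the data that would come from a Request object.

```php
<?php
// Hypothetical gateway; a real one would wrap database access.
class ArticlesGateway
{
    private $rows = array();

    public function insert(array $data)
    {
        $this->rows[] = $data;
        return count($this->rows); // pretend this is the new row ID
    }
}

// Hypothetical Transactions class following the injection rules.
class ArticleTransactions
{
    private $articles_gateway;

    // The dependency arrives via the constructor: no globals, no `new` here.
    public function __construct(ArticlesGateway $articles_gateway)
    {
        $this->articles_gateway = $articles_gateway;
    }

    // Input is passed in as a parameter, not read from $_POST.
    public function submitNewArticle(array $input)
    {
        return $this->articles_gateway->insert($input);
    }
}

// Object creation stays outside the class; in the application this
// would live in a Factory rather than inline like this.
$transactions = new ArticleTransactions(new ArticlesGateway());
$id = $transactions->submitNewArticle(array('title' => 'Hello'));
```

Because the method never touches a superglobal or creates its own collaborators, it can be called from a page script or a test with equal ease.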
After we have found some candidate code using a Gateway, we need to examine the code surrounding Gateway usage for these and other operations:
These and other pieces of logic are likely to be domain-related.
To successfully extract the domain logic to one or more Transactions classes and methods, we will have to perform these and other activities:
- [...] Transactions calls
- [...] Transactions classes and methods
Discovery-and-extraction is best thought of as a learning exercise. Picking apart the legacy application like this is a way of learning how the application is constructed. As such, we should not be afraid to make multiple attempts at extraction. If our first attempt fails, ends up ugly, or gives poor results, we should feel no guilt about scrapping the work and starting over, having learned a little more about what works and what doesn't. For my own part, I often make two or three passes at extracting domain logic before the work is completed to my satisfaction. This is where a revision control system makes our life so much easier; we can work piecemeal, committing only as we are happy with the result, and reverting to earlier stages if we need to begin again from a clean slate.
By way of example, recall the code we started with in Appendix B, Code before Gateways. Earlier in this chapter we mentioned that we had extracted embedded SQL statements to ArticlesGateway classes, ending up with the code in Appendix C, Code after Gateways. We now go from that to Appendix D, Code after Transaction Scripts, where we have extracted the domain logic to an ArticleTransactions class.
The extracted domain logic does not appear particularly complicated in its completed form, but actually doing the work turns out to be quite detailed. Review Appendix C, Code after Gateways, and compare it to Appendix D, Code after Transaction Scripts. Among other things, we should find the following:
- [...] ArticleTransactions class and two separate methods, one for creating and one for updating. We named the ArticleTransactions methods for the domain logic being performed, not for the implementation of the underlying technical operations.
- [...] ArticleTransactions class for reuse across both of the transaction methods.
- The ArticleTransactions class receives ArticlesGateway and UsersGateway dependencies to manage the database interactions instead of making direct SQL calls.
- [...] Transactions class as properties.
- [...] $failure variable as it gets modified throughout the transaction. That code must now get failure information from the ArticleTransactions class for later presentation.
After the extraction, we have a classes/ directory structure that looks something like the following. This is a result of using a domain-oriented class structure when we extracted SQL to Gateway classes:
/path/to/app/classes/
    Domain/
        Articles/
            ArticlesGateway.php
            ArticleTransactions.php
        Users/
            UsersGateway.php
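To make the shape of such a class concrete, here is a simplified sketch. It is not the code from Appendix D: the gateway methods, the "user must exist" rule, and the failure messages are all assumptions made for illustration, and the gateways are in-memory stand-ins. It shows two methods named for domain behaviors, injected Gateway dependencies, shared input filtering, and failure information kept as a property.

```php
<?php
// Hypothetical in-memory stand-ins for the real Gateway classes.
class ArticlesGateway
{
    private $rows = array();

    public function insert($user_id, $title, $body)
    {
        $id = count($this->rows) + 1;
        $this->rows[$id] = array(
            'user_id' => $user_id, 'title' => $title, 'body' => $body
        );
        return $id;
    }

    public function update($id, $title, $body)
    {
        if (! isset($this->rows[$id])) {
            return false;
        }
        $this->rows[$id]['title'] = $title;
        $this->rows[$id]['body'] = $body;
        return true;
    }
}

class UsersGateway
{
    public function exists($user_id)
    {
        return $user_id === 1; // pretend only user 1 exists
    }
}

class ArticleTransactions
{
    protected $articles_gateway;
    protected $users_gateway;
    protected $failure = array(); // domain state kept as a property

    public function __construct(
        ArticlesGateway $articles_gateway,
        UsersGateway $users_gateway
    ) {
        $this->articles_gateway = $articles_gateway;
        $this->users_gateway = $users_gateway;
    }

    // Named for the domain behavior, not for the underlying "insert".
    public function submitNewArticle($user_id, array $input)
    {
        $input = $this->filterInput($input);
        if (! $this->users_gateway->exists($user_id)) {
            $this->failure[] = 'Unknown user.';
        }
        if ($this->failure) {
            return false;
        }
        return $this->articles_gateway->insert(
            $user_id, $input['title'], $input['body']
        );
    }

    // Named for the domain behavior, not for the underlying "update".
    public function updateExistingArticle($article_id, array $input)
    {
        $input = $this->filterInput($input);
        if ($this->failure) {
            return false;
        }
        $updated = $this->articles_gateway->update(
            $article_id, $input['title'], $input['body']
        );
        if (! $updated) {
            $this->failure[] = 'Article not found.';
        }
        return $updated;
    }

    // The calling code asks for failure details after the transaction.
    public function getFailure()
    {
        return $this->failure;
    }

    // Shared by both transaction methods; resets the failure state.
    protected function filterInput(array $input)
    {
        $this->failure = array();
        $title = isset($input['title']) ? trim((string) $input['title']) : '';
        $body = isset($input['body']) ? trim((string) $input['body']) : '';
        if ($title === '') {
            $this->failure[] = 'Title cannot be blank.';
        }
        return array('title' => $title, 'body' => $body);
    }
}

// How the original code might now call the extracted logic.
$transactions = new ArticleTransactions(new ArticlesGateway(), new UsersGateway());
$id = $transactions->submitNewArticle(1, array('title' => ' First ', 'body' => 'Text'));
$ok = $transactions->updateExistingArticle($id, array('title' => '', 'body' => 'Text'));
$failure = $transactions->getFailure();
```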
This need not be our final refactoring. Further modifications of the ArticleTransactions class are still possible. For example, instead of injecting a UsersGateway, it might make sense to extract various domain logic related to users into a UserTransactions class and inject that instead. There is still a lot of repetition between the Transactions methods. We also need better error checking and condition reporting in the Transactions methods. These and other refactorings are secondary, and will be both more noticeable and easier to deal with only after the primary extraction of domain logic.
Once we have extracted one or more Transactions from the original code, we need to make sure the original code works when using the Transactions instead of the embedded domain logic. As before, we do this by running our pre-existing characterization tests. If we do not have characterization tests, we must browse to or otherwise invoke the changed code. If these tests fail, we rejoice! We have discovered that the extraction was flawed, and we have a chance to fix it before we deploy to production. If the "tests" pass, we likewise rejoice, and move on.
We now know the original code works with the newly extracted Transactions logic. However, the new classes and methods need their own set of tests. As with everything else related to extracting domain logic, writing these tests is likely to be detailed and demanding. The logic is probably convoluted, with lots of branches and loops. We should not let this deter us from testing. At the very least, we need to write tests that cover the main cases of the domain logic.
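A test for a Transactions method can be sketched as follows. In practice the tests would live in whatever test framework the project already uses; plain functions are used here only to keep the sketch self-contained. The classes, the blank-title rule, and the fake gateway are all assumptions for illustration.

```php
<?php
// Hypothetical gateway; the real one would talk to the database.
class ArticlesGateway
{
    public function insert(array $row)
    {
        throw new Exception('No database in this sketch.');
    }
}

// Test double that records calls instead of touching a database.
class FakeArticlesGateway extends ArticlesGateway
{
    public $inserted = array();

    public function insert(array $row)
    {
        $this->inserted[] = $row;
        return count($this->inserted);
    }
}

// Hypothetical class under test.
class ArticleTransactions
{
    protected $articles_gateway;

    public function __construct(ArticlesGateway $articles_gateway)
    {
        $this->articles_gateway = $articles_gateway;
    }

    public function submitNewArticle(array $input)
    {
        if (! isset($input['title']) || trim($input['title']) === '') {
            return false; // one branch of the domain logic
        }
        return $this->articles_gateway->insert($input);
    }
}

// One test per main case; each returns true when the case passes.
function testSubmitNewArticleSucceeds()
{
    $gateway = new FakeArticlesGateway();
    $transactions = new ArticleTransactions($gateway);
    $id = $transactions->submitNewArticle(array('title' => 'Hello'));
    return $id === 1 && count($gateway->inserted) === 1;
}

function testSubmitNewArticleRejectsBlankTitle()
{
    $gateway = new FakeArticlesGateway();
    $transactions = new ArticleTransactions($gateway);
    $result = $transactions->submitNewArticle(array('title' => '  '));
    return $result === false && count($gateway->inserted) === 0;
}

$results = array(
    'succeeds' => testSubmitNewArticleSucceeds(),
    'rejects blank title' => testSubmitNewArticleRejectsBlankTitle(),
);
```

Because the Gateway is injected, the test can substitute a fake and exercise each branch of the domain logic without a database.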
If necessary, we may refactor the extracted logic to separate methods that are themselves more easily testable. Breaking up the extracted logic will make it easier for us to see the flow and find repeated elements of logic. We must remember, though, that our goal is to maintain the existing behavior, not change the behavior presented by the legacy application.
For insights and techniques on how to make the extracted logic more testable, please see Refactoring (http://refactoring.com/) by Martin Fowler et al., as well as Working Effectively With Legacy Code (https://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/01311) by Michael Feathers.
Finally, because our testing and related refactoring of the extracted Transactions logic may have introduced some unexpected changes, we spot check the original code one more time using our characterization tests or by otherwise invoking the relevant code. If these fail, we rejoice! We have found out that our changes were not as good as we thought, and we have a chance to correct the code and tests before they get too far away from us.
When both the original code tests and the extracted Transactions tests pass, we rejoice again! We can now commit all of our new work, push it to the central repository, and notify QA that our modernized code is ready for them to review.
The term Transaction Script refers to an architectural pattern, and does not mean that the domain logic must be wrapped in an SQL transaction. It is easy to confuse the two ideas.
Having said that, keeping SQL transactions in mind may help us when extracting domain logic. One useful rule-of-thumb is that pieces of domain logic should be split up according to how well they would fit inside a single SQL transaction. That hypothetical transaction would be committed, or rolled back, as an atomic whole.
This singularity-of-purpose will help us determine where the boundaries of our domain logic lie. We do not actually add SQL transactions; thinking in those terms simply gives us some insight into where those boundaries are.
When we extracted SQL statements to Gateway classes, we sometimes found queries that were similar but not exactly identical. We had to determine if there was a way to combine them into a single method or not.
In the same way, we may discover that some parts of our legacy domain logic have been copied and pasted in two or more locations. When we find these, we have the same problem as with our Gateway classes. Are the pieces of logic similar enough to be combined into a single method, or must they be different methods (or even in completely different Transactions)?
The answer is that it depends. In some cases the repeated code will be an obvious copy of logic elsewhere, meaning we can reuse existing Transactions methods. If not, we need to extract to a new Transactions class or method.
There is also a middle path, where the domain logic as a whole is different, but there are support elements of logic that are identical across different Transactions. In these cases, we can refactor the supporting logic as methods on an abstract base Transactions class, and then extend new Transactions from it. Alternatively, we can extract the logic to a supporting class and inject it into our Transactions.
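The abstract-base option can be sketched roughly like this. The class names, the CommentTransactions example, and the not-blank rule are hypothetical; only the idea of sharing support logic through a base class comes from the text above.

```php
<?php
// Support logic that is identical across different Transactions.
abstract class AbstractTransactions
{
    protected $failure = array();

    protected function requireNotBlank($field, $value)
    {
        if (trim((string) $value) === '') {
            $this->failure[] = "{$field} cannot be blank.";
            return false;
        }
        return true;
    }

    public function getFailure()
    {
        return $this->failure;
    }
}

// Each concrete class has different domain logic but reuses the support method.
class ArticleTransactions extends AbstractTransactions
{
    public function submitNewArticle(array $input)
    {
        $title = isset($input['title']) ? $input['title'] : '';
        return $this->requireNotBlank('Title', $title);
    }
}

class CommentTransactions extends AbstractTransactions
{
    public function postComment(array $input)
    {
        $text = isset($input['text']) ? $input['text'] : '';
        return $this->requireNotBlank('Comment', $text);
    }
}

$articles = new ArticleTransactions();
$accepted = $articles->submitNewArticle(array('title' => ''));
$failure = $articles->getFailure();
```

The injected alternative would move requireNotBlank() into a separate support class passed to each constructor, which avoids inheritance at the cost of one more dependency.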
Our Transactions classes should not be using print or echo. The domain logic should only return or retain data.
When we discover output generation in the middle of our domain logic, we should extract that portion so that it lies outside of the domain logic. In general, this means collecting output in the Transactions class and then either returning it or making it available by a separate method. Leave output generation for the presentation layer.
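One way to collect output rather than emit it is sketched here. The method names and messages are assumptions for illustration; the point is only that the Transactions class retains the messages and the calling code decides how to present them.

```php
<?php
class ArticleTransactions
{
    protected $messages = array();

    public function updateExistingArticle(array $input)
    {
        // The legacy logic might have used echo at these points;
        // here the messages are retained instead.
        if (empty($input['title'])) {
            $this->messages[] = 'Title cannot be blank.';
            return false;
        }
        $this->messages[] = 'Article saved.';
        return true;
    }

    // Output is made available through a separate method.
    public function getMessages()
    {
        return $this->messages;
    }
}

// Presentation code builds the markup outside the domain logic.
$transactions = new ArticleTransactions();
$transactions->updateExistingArticle(array('title' => 'Hello'));

$html = '';
foreach ($transactions->getMessages() as $message) {
    $html .= '<p>' . htmlspecialchars($message, ENT_QUOTES, 'UTF-8') . '</p>';
}
// The page script or view would now echo $html.
```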
In the examples, we showed Transactions as a collection of methods related to a particular domain entity, such as ArticleTransactions. Each part of the domain logic related to that entity was wrapped in a class method.
However, it is also reasonable to break up domain logic into a one-class-per-transaction structure. Indeed, some transactions may be complex enough that they truly require their own separate classes. There is nothing wrong with using a single class to represent a single domain logic transaction.
For example, the earlier ArticleTransactions class might be split into an abstract base class with support methods, and two concrete classes, one for each of the extracted pieces of domain logic. Each of the concrete classes extends the AbstractArticleTransaction, like so:
classes/
    Domain/
        Articles/
            ArticlesGateway.php
            Transaction/
                AbstractArticleTransaction.php
                SubmitNewArticleTransaction.php
                UpdateExistingArticleTransaction.php
        Users/
            UsersGateway.php
If we use a one-class-per-transaction approach, what do we name the main method on the single-transaction class, the one that actually performs the transaction? If there is a common convention for main methods that already exist in our legacy codebase, we should adhere to that convention. Otherwise, we need to pick a single consistent method name. Personally, I enjoy co-opting the __invoke() magic method for this purpose, but you may wish to use exec() or some other appropriate term to indicate we are executing or otherwise performing the transaction.
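A single-transaction class using __invoke() might be sketched as follows. The gateway is an in-memory stand-in and the method bodies are assumptions; the class names follow the directory listing above.

```php
<?php
// Hypothetical in-memory gateway standing in for the real one.
class ArticlesGateway
{
    private $rows = array();

    public function insert(array $row)
    {
        $this->rows[] = $row;
        return count($this->rows);
    }
}

// Support logic shared by the single-transaction classes.
abstract class AbstractArticleTransaction
{
    protected $articles_gateway;

    public function __construct(ArticlesGateway $articles_gateway)
    {
        $this->articles_gateway = $articles_gateway;
    }
}

class SubmitNewArticleTransaction extends AbstractArticleTransaction
{
    // The one main method: invoking the object performs the transaction.
    public function __invoke($user_id, array $input)
    {
        $input['user_id'] = $user_id;
        return $this->articles_gateway->insert($input);
    }
}

$submit_new_article = new SubmitNewArticleTransaction(new ArticlesGateway());
$article_id = $submit_new_article(7, array('title' => 'Hello'));
```

With __invoke(), the object is called like a function, so the class name alone describes the transaction. A named method such as exec() works equally well if that reads more clearly in the existing codebase.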
When we extracted our SQL statements to Gateway classes, it is possible that we moved some domain logic into them instead of leaving that logic in its original location. At that earlier point in our refactoring work, it was very easy to confuse domain-level input filtering (which makes sure the data conforms to a domain-specific state) with database-level filtering (which makes sure the data is safe to use with the database).
Now we can more easily tell the difference between the two. If we discover that there is domain-level logic in our Gateway classes, we should probably extract it to our Transactions classes instead. We need to be sure to update the relevant tests as well.
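The distinction can be illustrated with a small sketch. Everything here is hypothetical, including the title-length rule and the retitleArticle() method, and the gateway stores rows in an array instead of a database. The Gateway keeps only the storage-level handling, while the rule about what counts as a valid title lives in the Transactions class.

```php
<?php
// After the move, the Gateway only makes values safe for storage.
class ArticlesGateway
{
    public $rows = array();

    public function updateTitle($id, $title)
    {
        // Database-level concern: correct types for the storage layer.
        $this->rows[(int) $id] = (string) $title;
        return true;
    }
}

class ArticleTransactions
{
    const MAX_TITLE_LENGTH = 100; // assumed domain rule, for illustration

    protected $articles_gateway;

    public function __construct(ArticlesGateway $articles_gateway)
    {
        $this->articles_gateway = $articles_gateway;
    }

    public function retitleArticle($id, $title)
    {
        // Domain-level concern: formerly buried in the Gateway method.
        $title = trim($title);
        if ($title === '' || strlen($title) > self::MAX_TITLE_LENGTH) {
            return false;
        }
        return $this->articles_gateway->updateTitle($id, $title);
    }
}

$gateway = new ArticlesGateway();
$transactions = new ArticleTransactions($gateway);
$saved = $transactions->retitleArticle('5', '  New title ');
$rejected = $transactions->retitleArticle('5', str_repeat('x', 101));
```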
The examples in this chapter show domain logic embedded in page scripts. It is just as likely that we have domain logic embedded in classes as well. If the class can reasonably be considered part of the domain, and contains only domain-related logic, but is not named for the domain, it may be wise to move the class into the domain namespace.
Otherwise, if the class has any responsibilities other than domain logic, we may proceed to extract the domain logic from it in the same way that we extracted logic from a page script. After the extraction, the original class will then need to have the relevant Transactions class injected as a dependency. The original class should then make calls to the Transactions as appropriate.
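As a rough sketch of that arrangement, consider a hypothetical ArticleFeedImporter class whose own responsibility is walking a list of feed items. The class names and the simplified domain rule are assumptions for illustration.

```php
<?php
// Hypothetical Transactions class holding the extracted domain logic.
class ArticleTransactions
{
    public function submitNewArticle(array $input)
    {
        return isset($input['title']) && trim($input['title']) !== '';
    }
}

// Hypothetical non-domain class that used to embed the domain logic.
class ArticleFeedImporter
{
    protected $article_transactions;

    // The Transactions object is injected as a dependency.
    public function __construct(ArticleTransactions $article_transactions)
    {
        $this->article_transactions = $article_transactions;
    }

    // Keeps its own responsibility (walking the items) and delegates
    // the domain work to the injected Transactions object.
    public function import(array $items)
    {
        $imported = 0;
        foreach ($items as $item) {
            if ($this->article_transactions->submitNewArticle($item)) {
                $imported++;
            }
        }
        return $imported;
    }
}

$importer = new ArticleFeedImporter(new ArticleTransactions());
$count = $importer->import(array(
    array('title' => 'One'),
    array('title' => ''),
    array('title' => 'Two'),
));
```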