We never have enough time for testing, so let’s just write the test first.
—Kent Beck
Test-First is a Built-In Quality practice derived from Extreme Programming (XP) that recommends building tests before writing code to improve delivery by focusing on the intended results.
Agile testing differs from the big-bang, deferred testing approach of traditional development. Instead, the code is developed and tested in small increments, often with the test written ahead of the code. This way, tests help elaborate and better define the intended system behavior even before the system is coded. Quality is built in from the beginning. This just-in-time approach to elaborating the proposed system behavior also reduces the need for the overly detailed requirements specifications and sign-offs that traditional software development often uses to control quality. Even better, these tests, unlike conventionally written requirements, are automated wherever possible. Even when they are not automated, they still provide a definitive statement of what the system actually does, rather than a statement of early thoughts about what it was supposed to do.
Agile testing is a continuous process that’s integral to Lean and built-in quality. In other words, Agile Teams and Agile Release Trains (ARTs) can’t go fast without high quality, and they can’t achieve that goal without continuous testing and, wherever possible, testing first.
Brian Marick, an XP proponent and one of the authors of the Agile Manifesto, helped pioneer Agile testing by describing a matrix that guides the reasoning behind such tests. This approach was further developed in Agile Testing and extended for the scaling Agile paradigm in Agile Software Requirements [1, 2]. Figure 1 describes and extends Marick’s original matrix with guidance on what to test and when.
The horizontal axis of the matrix contains business- or technology-facing tests. Business-facing tests are understandable by the user and written using business terminology. Technology-facing tests are written in the language of the developer and are used to evaluate whether the code delivers the behaviors the developer intended.
The vertical axis contains tests supporting development (evaluating internal code) or critiquing the solution (evaluating the system against the user’s requirements).
Classification into the four quadrants (Q1–Q4) of the Agile testing matrix enables a comprehensive testing strategy that helps ensure quality:
Q1 – Contains unit and component tests. Tests are written to run before and after code changes to confirm that the system works as intended.
Q2 – Contains functional tests (user acceptance tests) for Stories, Features, and Capabilities to validate that they work the way the Product Owner (or Customer/user) intended. Feature- and capability-level acceptance tests confirm the aggregate behavior of many user stories. Teams automate these tests whenever possible and use manual ones only when there is no other choice.
Q3 – Contains system-level acceptance tests to validate that the behavior of the whole system meets the usability and functionality requirements, including scenarios that are often encountered in actual system use. These tests may include exploratory tests, user acceptance tests, scenario-based tests, and final usability tests. Because they involve users and testers engaged in real or simulated deployment scenarios, Q3 tests are often manual. They’re frequently the final system validation before delivery of the system to the end user.
Q4 – Contains system qualities testing to verify that the system meets its Nonfunctional Requirements (NFRs), as exhibited in part by Enabler tests. These tests are typically supported by a suite of automated testing tools designed specifically for this purpose, such as those examining load and performance. Because any system change can violate conformance with NFRs, these tests must be run continuously, or at least whenever it’s practical (a minimal example follows this list).
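As a simple illustration of what a Q4-style check can look like when it is folded into an automated suite, the sketch below asserts a response-time budget for a hypothetical search_catalog function. The function, the 200-millisecond threshold, and the use of Python's unittest module are assumptions made purely for illustration; dedicated load- and performance-testing tools would drive far more realistic workloads.

import time
import unittest

# Hypothetical system under test; a stand-in for a real service call.
def search_catalog(query: str) -> list:
    return [item for item in ("anvil", "rocket", "rope") if query in item]

class ResponseTimeNfrTest(unittest.TestCase):
    """Illustrative Q4-style check: assert that a response-time NFR stays within budget."""

    def test_search_responds_within_200ms(self):
        start = time.perf_counter()
        search_catalog("ro")
        elapsed_ms = (time.perf_counter() - start) * 1000
        # The 200 ms budget is an assumed figure for this example only.
        self.assertLess(elapsed_ms, 200, f"search took {elapsed_ms:.1f} ms")

if __name__ == "__main__":
    unittest.main()

Because a check like this runs in the same harness as the functional tests, it can execute on every build, which is what makes near-continuous verification of NFRs practical.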
Quadrants 1 and 2 in Figure 1 define the functionality of the system. Test-first practices include both Test-Driven Development (TDD) and Acceptance Test–Driven Development (ATDD). Both involve creating the test before developing the code, and both use test automation to support continuous integration, team velocity, and development effectiveness. The next section describes Q1 and Q2. The companion chapters, Release on Demand and Nonfunctional Requirements, describe Q3 and Q4, respectively.
Beck and others have defined a set of XP practices under the umbrella label of TDD [3]:
Write the test first, which ensures that the developer understands the required behavior.
Run the test and watch it fail. Because there is no code yet, this might seem silly initially, but it accomplishes two useful objectives: It verifies that the test works, including its harnesses, and it demonstrates how the system will behave if the code is incorrect.
Write the minimum amount of code needed to pass the test. If it fails, rework the code or the test until it routinely passes. (A minimal example of this cycle follows.)
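The sketch below compresses one such red-green cycle into a single listing, assuming Python's unittest module and a hypothetical leap_year function as the behavior under development; it illustrates the steps above rather than prescribing an implementation.

import unittest

# Steps 1 and 2: the test is written first. Run before leap_year is
# implemented, it fails, which proves the test and its harness work.
class LeapYearTest(unittest.TestCase):
    def test_typical_years(self):
        self.assertTrue(leap_year(2024))
        self.assertFalse(leap_year(2023))

    def test_century_years_are_leap_years_only_when_divisible_by_400(self):
        self.assertFalse(leap_year(1900))
        self.assertTrue(leap_year(2000))

# Step 3: the minimum code needed to make the tests pass,
# shown here in its final form for brevity.
def leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

if __name__ == "__main__":
    unittest.main()

Once the minimal implementation passes, the developer can refactor freely, rerunning the tests after each change as a safety net.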
In XP, this practice was designed primarily to operate in the context of unit tests, which are developer-written tests (and code) that evaluate the classes and methods used. Unit tests are considered a form of ‘white-box testing’ because they exercise the internal workings of the system and its various code paths. In pair work, two people collaborate to develop the code and tests simultaneously; this practice provides a built-in peer review, which helps assure high quality. Even when the code is not developed through pair work, the tests ensure that another set of eyes reviews it. Developers also refactor the code so that it passes the test as simply and elegantly as possible, which is one of the main reasons that SAFe relies on TDD.
Most TDD takes place at the unit test level, which frees quality assurance (QA) and test personnel from spending most of their time finding and reporting code-level bugs. Instead, these personnel can focus on system-level testing challenges, where more complex behaviors emerge from the interactions between unit code modules. The open source community has built unit testing frameworks for most languages and technologies, including Java, C, C#, C++, Python, XML, and HTTP. In fact, unit testing frameworks are available for nearly every coding construct a developer is likely to encounter. They provide a harness for developing and maintaining unit tests and for automatically executing them against the system.
Because unit tests are written before or concurrently with the code, and their frameworks automate test execution, unit testing occurs within the same Iteration as the code it verifies. Moreover, the unit test frameworks hold and manage the accumulated unit tests, so regression testing automation for unit tests comes largely for free. Unit testing is a cornerstone of software agility, and any investment made in comprehensive unit testing usually improves the organization’s quality and productivity.
Similarly, teams use tests to evaluate larger-scale components of the system. Many of these components span multiple architectural layers, where they provide services needed by features or other modules. Testing tools and practices for implementing component tests vary. For example, unit testing frameworks can host larger, more complex tests written in the framework’s language (e.g., Java, C, C#), so many teams build their component tests with the same frameworks and may not even think of them as a separate kind of test; they are simply part of the testing strategy. In other cases, developers may incorporate other testing tools or write entirely custom tests in any language or environment that supports testing of the broader system behaviors. These automated tests serve as a primary defense against the unanticipated consequences of refactoring and new code.
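As a sketch of how this can look in practice, the example below uses a unit testing framework (Python's unittest, chosen only for illustration) to exercise a hypothetical OrderService component together with an in-memory stand-in for its persistence layer; all of the names and behavior are invented for the example.

import unittest

class InMemoryOrderRepository:
    """Test double standing in for the real database-backed repository."""
    def __init__(self):
        self._orders = {}

    def save(self, order_id, lines):
        self._orders[order_id] = list(lines)

    def find(self, order_id):
        return self._orders.get(order_id)

class OrderService:
    """Hypothetical component spanning the business and persistence layers."""
    def __init__(self, repository):
        self._repository = repository

    def place_order(self, order_id, lines):
        if not lines:
            raise ValueError("an order needs at least one line")
        self._repository.save(order_id, lines)

    def order_total(self, order_id):
        lines = self._repository.find(order_id) or []
        return sum(quantity * price for quantity, price in lines)

class OrderServiceComponentTest(unittest.TestCase):
    """Component-level test written in the same framework as the unit tests."""
    def test_placed_order_can_be_totaled(self):
        service = OrderService(InMemoryOrderRepository())
        service.place_order("A-1", [(2, 10.0), (1, 5.0)])
        self.assertEqual(service.order_total("A-1"), 25.0)

if __name__ == "__main__":
    unittest.main()

Because the test drives the component through its public interface rather than one method at a time, it guards the collaboration between layers while remaining fast enough to run alongside the unit suite.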
Quadrant 2 of the Agile testing matrix shows that the test-first philosophy applies to testing of stories, features, and capabilities just as it does to unit testing—an approach called Acceptance Test–Driven Development (ATDD). Whether ATDD is adopted formally or informally, many teams find it more efficient to write the acceptance test first, before developing the code. After all, the goal is to have the whole system work as intended.
Ken Pugh notes that the emphasis with this approach is more on expressing requirements in unambiguous terms than on the test per se [4]. He further observes that this detailing process goes by three alternative labels: ATDD, Specification by Example (SBE), and Behavior-Driven Development (BDD). Although these approaches differ slightly, they all emphasize understanding requirements before implementation. In particular, SBE suggests that Product Owners provide realistic examples rather than abstract statements, since they often do not write the acceptance tests themselves.
Whether ATDD is viewed as a form of requirements expression or as a test, the result is the same. Acceptance tests record the decisions made in the conversation between the team and the Product Owner so that the team understands the specifics of the intended behavior the story represents. (See the 3Cs, the card, conversation, and confirmation, in the “Writing Good Stories” section of the Story chapter.)
Story acceptance tests confirm that each new user story implemented delivers its intended behavior during the iteration. If these stories work as intended, then it’s likely that each increment of software will ultimately satisfy the needs of the users.
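The sketch below illustrates the idea for a hypothetical story, “As a shopper, I can apply a discount code at checkout.” The Checkout class, the SAVE10 code, and the 10 percent rate are invented for the example, and Python's unittest is used only for illustration; the Given/When/Then comments record the confirmation agreed on in the conversation with the Product Owner.

import unittest

class Checkout:
    """Hypothetical business logic behind the discount-code story."""
    DISCOUNTS = {"SAVE10": 0.10}  # assumed example code and rate

    def __init__(self):
        self._items = []
        self._discount = 0.0

    def add_item(self, price):
        self._items.append(price)

    def apply_code(self, code):
        self._discount = self.DISCOUNTS.get(code, 0.0)

    def total(self):
        return round(sum(self._items) * (1 - self._discount), 2)

class ApplyDiscountCodeAcceptanceTest(unittest.TestCase):
    def test_valid_code_reduces_the_total(self):
        # Given a cart containing 100.00 worth of items
        checkout = Checkout()
        checkout.add_item(100.00)
        # When the shopper applies the code SAVE10
        checkout.apply_code("SAVE10")
        # Then the total reflects a 10 percent discount
        self.assertEqual(checkout.total(), 90.00)

    def test_unknown_code_leaves_the_total_unchanged(self):
        # Given the same cart, When an unrecognized code is applied,
        # Then the total is unchanged.
        checkout = Checkout()
        checkout.add_item(100.00)
        checkout.apply_code("BOGUS")
        self.assertEqual(checkout.total(), 100.00)

if __name__ == "__main__":
    unittest.main()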
During a Program Increment (PI), feature and capability acceptance testing is performed using similar tests. The difference is that these tests operate at the next level of abstraction, typically confirming that several stories work together to deliver a more significant amount of value to the user. Of course, a more complex feature may have multiple acceptance tests associated with it, and the same goes for stories: tests are needed to verify that the system works as intended at every level of abstraction.
The following are characteristics of functional tests:
Written in the language of the business
Developed in a conversation between developers, testers, and the Product Owner
‘Black-box tested’ to verify only that the outputs of the system meet its conditions of satisfaction, without concern for the internal workings of the system
Run in the same iteration as the code development
Although anyone can write tests, the Product Owner, as the Business Owner/customer proxy, is responsible for their efficacy. If a story does not pass its test, the team gets no credit for that story, and it is carried over into the next iteration, where either the test or the code is fixed.
Features, capabilities, and stories must pass one or more acceptance tests to meet their Definition of Done. Stories realize the intended features and capabilities, and multiple tests may be associated with a particular work item.
Because acceptance tests run at a level above the code, a variety of approaches have been proposed for executing them, including handling them as manual tests. Manual tests tend to pile up very quickly: The faster you go, the faster they grow, and then the slower you go. Eventually, the amount of manual work required to run regression testing slows down the team and causes delays in value delivery.
To avoid this pattern, teams must automate most of their acceptance tests. They can use a variety of tools for this purpose, including the target programming language (e.g., Perl, PHP, Python, Java) or natural language as supported by specific testing frameworks, such as Cucumber. Alternatively, they may use table formats such as those of the Framework for Integrated Testing (FIT). The preferred approach is to test at a higher level of abstraction, working against the business logic of the application so that the presentation layer and other implementation details do not get in the way of testing.
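The sketch below shows both ideas at once: a Cucumber-style natural-language scenario (included only as a comment, since the feature wording is an assumption) and an automated test that drives a hypothetical Account business object directly, bypassing the presentation layer entirely.

import unittest

# Natural-language form of the scenario, as a Cucumber-style tool
# might express it (wording is illustrative only):
#
#   Feature: Account withdrawal
#     Scenario: Withdrawal within the available balance
#       Given an account with a balance of 100
#       When the owner withdraws 30
#       Then the balance is 70

class Account:
    """Hypothetical business-logic object; the test calls it directly."""
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

class WithdrawalAcceptanceTest(unittest.TestCase):
    def test_withdrawal_within_available_balance(self):
        account = Account(balance=100)         # Given
        account.withdraw(30)                   # When
        self.assertEqual(account.balance, 70)  # Then

if __name__ == "__main__":
    unittest.main()

Because the test exercises the business logic rather than the user interface, it remains stable when screens change and fast enough to run with every build.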
An ATDD checklist can help the team consider a simple list of things to do, review, and discuss each time a new story appears. Agile Software Requirements provides an example of a story acceptance-testing checklist [2].
LEARN MORE
[1] Crispin, Lisa, and Janet Gregory. Agile Testing: A Practical Guide for Testers and Agile Teams. Addison-Wesley, 2009.
[2] Leffingwell, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison-Wesley, 2011.
[3] Beck, Kent. Test-Driven Development: By Example. Addison-Wesley, 2003.
[4] Pugh, Ken. Lean-Agile Acceptance Test-Driven Development: Better Software Through Collaboration. Addison-Wesley, 2011.