Test
Functions
Each test file must import the testing
package.
Test functions have the following signature:
func TestName(t *testing.T) { // ... }
Test function names must begin with Test
;
the optional suffix Name
must begin with
a capital letter:
func TestSin(t *testing.T) { /* ... */ } func TestCos(t *testing.T) { /* ... */ } func TestLog(t *testing.T) { /* ... */ }
The t
parameter provides methods for reporting test failures and
logging additional information. Let’s define an example package
gopl.io/ch11/word1
, containing a single function IsPalindrome
that reports whether a string reads the same forward and backward.
(This implementation tests every byte twice if the string is a
palindrome; we’ll come back to that shortly.)
// Package word provides utilities for word games. package word // IsPalindrome reports whether s reads the same forward and backward. // (Our first attempt.) func IsPalindrome(s string) bool { for i := range s { if s[i] != s[len(s)-1-i] { return false } } return true }
In the same directory, the file word_test.go
contains two test
functions named TestPalindrome
and TestNonPalindrome
.
Each checks that IsPalindrome
gives the
right answer for a single input and reports failures using t.Error
:
package word import "testing" func TestPalindrome(t *testing.T) { if !IsPalindrome("detartrated") { t.Error(`IsPalindrome("detartrated") = false`) } if !IsPalindrome("kayak") { t.Error(`IsPalindrome("kayak") = false`) } } func TestNonPalindrome(t *testing.T) { if IsPalindrome("palindrome") { t.Error(`IsPalindrome("palindrome") = true`) } }
A go test
(or go build
) command with no package
arguments operates on the package in the current directory.
We can build and run the tests with the following command.
$ cd $GOPATH/src/gopl.io/ch11/word1 $ go test ok gopl.io/ch11/word1 0.008s
Satisfied, we ship the program, but no sooner have the launch party
guests departed than the bug reports start to arrive. A French user
named Noelle Eve Elleon complains that IsPalindrome
doesn’t recognize
“été.” Another, from Central America, is disappointed that it rejects “A man, a plan, a
canal: Panama.” These specific and small bug reports naturally lend
themselves to new test cases.
func TestFrenchPalindrome(t *testing.T) { if !IsPalindrome("été") { t.Error(`IsPalindrome("été") = false`) } } func TestCanalPalindrome(t *testing.T) { input := "A man, a plan, a canal: Panama" if !IsPalindrome(input) { t.Errorf(`IsPalindrome(%q) = false`, input) } }
To avoid writing the long input
string twice, we use Errorf
,
which provides formatting like Printf
.
When the two new tests have been added, the go test
command fails with
informative error messages.
$ go test --- FAIL: TestFrenchPalindrome (0.00s) word_test.go:28: IsPalindrome("été") = false --- FAIL: TestCanalPalindrome (0.00s) word_test.go:35: IsPalindrome("A man, a plan, a canal: Panama") = false FAIL FAIL gopl.io/ch11/word1 0.014s
It’s good practice to write the test first and observe that it triggers the same failure described by the user’s bug report. Only then can we be confident that whatever fix we come up with addresses the right problem.
As a bonus, running go test
is usually quicker than manually going
through the steps described in the bug report, allowing us to iterate
more rapidly. If the test suite contains many slow tests, we may make
even faster progress if we’re selective about which ones we run.
The -v
flag prints the name and execution time of each test in
the package:
$ go test -v === RUN TestPalindrome --- PASS: TestPalindrome (0.00s) === RUN TestNonPalindrome --- PASS: TestNonPalindrome (0.00s) === RUN TestFrenchPalindrome --- FAIL: TestFrenchPalindrome (0.00s) word_test.go:28: IsPalindrome("été") = false === RUN TestCanalPalindrome --- FAIL: TestCanalPalindrome (0.00s) word_test.go:35: IsPalindrome("A man, a plan, a canal: Panama") = false FAIL exit status 1 FAIL gopl.io/ch11/word1 0.017s
and the -run
flag, whose argument is a regular expression, causes
go test
to run only those tests whose function name matches the
pattern:
$ go test -v -run="French|Canal" === RUN TestFrenchPalindrome --- FAIL: TestFrenchPalindrome (0.00s) word_test.go:28: IsPalindrome("été") = false === RUN TestCanalPalindrome --- FAIL: TestCanalPalindrome (0.00s) word_test.go:35: IsPalindrome("A man, a plan, a canal: Panama") = false FAIL exit status 1 FAIL gopl.io/ch11/word1 0.014s
Of course, once we’ve gotten the selected tests to pass, we should
invoke go test
with no flags to run the entire test suite one last
time before we commit the change.
Now our task is to fix the bugs.
A quick investigation reveals the cause of the first bug to be
IsPalindrome
’s use of byte sequences, not rune sequences, so that
non-ASCII characters such as the é in "été"
confuse it.
The second bug arises
from not ignoring spaces, punctuation, and letter case.
Chastened, we rewrite the function more carefully:
// Package word provides utilities for word games. package word import "unicode" // IsPalindrome reports whether s reads the same forward and backward. // Letter case is ignored, as are non-letters. func IsPalindrome(s string) bool { var letters []rune for _, r := range s { if unicode.IsLetter(r) { letters = append(letters, unicode.ToLower(r)) } } for i := range letters { if letters[i] != letters[len(letters)-1-i] { return false } } return true }
We also write a more comprehensive set of test cases that combines all the previous ones and a number of new ones into a table.
func TestIsPalindrome(t *testing.T) { var tests = []struct { input string want bool }{ {"", true}, {"a", true}, {"aa", true}, {"ab", false}, {"kayak", true}, {"detartrated", true}, {"A man, a plan, a canal: Panama", true}, {"Evil I did dwell; lewd did I live.", true}, {"Able was I ere I saw Elba", true}, {"été", true}, {"Et se resservir, ivresse reste.", true}, {"palindrome", false}, // non-palindrome {"desserts", false}, // semi-palindrome } for _, test := range tests { if got := IsPalindrome(test.input); got != test.want { t.Errorf("IsPalindrome(%q) = %v", test.input, got) } } }
Our new tests pass:
$ go test gopl.io/ch11/word2 ok gopl.io/ch11/word2 0.015s
This style of table-driven testing is very common in Go. It is straightforward to add new table entries as needed, and since the assertion logic is not duplicated, we can invest more effort in producing a good error message.
The output of a failing test does not include the entire
stack trace at the moment of the call to t.Errorf
. Nor does
t.Errorf
cause a panic or stop the execution of the test, unlike
assertion failures in many test frameworks for other languages. Tests
are independent of each other. If an
early entry in the table causes the test to fail, later table entries
will still be checked, and thus we may learn about multiple failures
during a single run.
When we really must stop a test function, perhaps because some
initialization code failed or to prevent a failure already reported
from causing a confusing cascade of others, we use t.Fatal
or
t.Fatalf
.
These must be called from the same goroutine as the Test
function, not
from another one created during the test.
Test failure messages are usually of the form "f(x) = y, want z"
,
where f(x)
explains the attempted operation and its input, y
is the
actual result, and z
the expected result. Where convenient, as in
our palindrome example, actual Go syntax is used for the f(x)
part.
Displaying x
is particularly important in a table-driven test, since
a given assertion is executed many times with different values.
Avoid boilerplate and redundant information.
When testing a boolean function such as IsPalindrome
, omit the
want z
part since it adds no information.
If x
, y
, or z
is lengthy, print a concise summary
of the relevant parts instead.
The author of a test should strive to help
the programmer who must diagnose a test failure.
Exercise 11.1:
Write tests for the charcount
program in Section 4.3.
Exercise 11.2:
Write a set of tests for IntSet
(§6.5)
that checks that its behavior after each operation is
equivalent to a set based on built-in maps.
Save your implementation for benchmarking in Exercise 11.7.
Table-driven tests are convenient for checking that a function works on inputs carefully selected to exercise interesting cases in the logic. Another approach, randomized testing, explores a broader range of inputs by constructing inputs at random.
How do we know what output to expect from our function, given a random input? There are two strategies. The first is to write an alternative implementation of the function that uses a less efficient but simpler and clearer algorithm, and check that both implementations give the same result. The second is to create input values according to a pattern so that we know what output to expect.
The example below uses the second approach: the randomPalindrome
function generates words that are known to be palindromes by construction.
import "math/rand" // randomPalindrome returns a palindrome whose length and contents // are derived from the pseudo-random number generator rng. func randomPalindrome(rng *rand.Rand) string { n := rng.Intn(25) // random length up to 24 runes := make([]rune, n) for i := 0; i < (n+1)/2; i++ { r := rune(rng.Intn(0x1000)) // random rune up to '\u0999' runes[i] = r runes[n-1-i] = r } return string(runes) } func TestRandomPalindromes(t *testing.T) { // Initialize a pseudo-random number generator. seed := time.Now().UTC().UnixNano() t.Logf("Random seed: %d", seed) rng := rand.New(rand.NewSource(seed)) for i := 0; i < 1000; i++ { p := randomPalindrome(rng) if !IsPalindrome(p) { t.Errorf("IsPalindrome(%q) = false", p) } } }
Since randomized tests are nondeterministic, it is critical that the
log of the failing test record sufficient information to reproduce
the failure. In our example, the input p
to IsPalindrome
tells us
all we need to know, but for functions that accept more complex
inputs, it may be simpler to log the seed of the pseudo-random number
generator (as we do above) than to dump the entire input data
structure. Armed with that seed value, we can easily modify the test
to replay the failure deterministically.
By using the current time as a source of randomness, the test will explore novel inputs each time it is run, over the entire course of its lifetime. This is especially valuable if your project uses an automated system to run all its tests periodically.
Exercise 11.3:
TestRandomPalindromes
only tests palindromes.
Write a randomized test that generates and verifies non-palindromes.
Exercise 11.4:
Modify randomPalindrome
to exercise IsPalindrome
’s
handling of punctuation and spaces.
The go test
tool is useful for testing library packages, but with a little
effort we can use it to test commands as well.
A package named main
ordinarily produces an executable program,
but it can be imported as a library too.
Let’s write a test for the echo
program of Section 2.3.2.
We’ve split the program into two functions: echo
does the real work, while
main
parses and reads the flag values and reports any errors returned
by echo
.
// Echo prints its command-line arguments. package main import ( "flag" "fmt" "io" "os" "strings" ) var ( n = flag.Bool("n", false, "omit trailing newline") s = flag.String("s", " ", "separator") ) var out io.Writer = os.Stdout // modified during testing func main() { flag.Parse() if err := echo(!*n, *s, flag.Args()); err != nil { fmt.Fprintf(os.Stderr, "echo: %v\n", err) os.Exit(1) } } func echo(newline bool, sep string, args []string) error { fmt.Fprint(out, strings.Join(args, sep)) if newline { fmt.Fprintln(out) } return nil }
From the test, we will call echo
with a variety of arguments and
flag settings and check that it prints the correct output in each
case, so we’ve added parameters to echo
to reduce its dependence
on global variables. That said, we’ve also introduced another global
variable, out
, the io.Writer
to which the result will be written.
By having echo
write through this variable, not directly to
os.Stdout
, the tests can substitute a different Writer
implementation that records what was written for later inspection.
Here’s the test, in file echo_test.go
:
package main import ( "bytes" "fmt" "testing" ) func TestEcho(t *testing.T) { var tests = []struct { newline bool sep string args []string want string }{ {true, "", []string{}, "\n"}, {false, "", []string{}, ""}, {true, "\t", []string{"one", "two", "three"}, "one\ttwo\tthree\n"}, {true, ",", []string{"a", "b", "c"}, "a,b,c\n"}, {false, ":", []string{"1", "2", "3"}, "1:2:3"}, } for _, test := range tests { descr := fmt.Sprintf("echo(%v, %q, %q)", test.newline, test.sep, test.args) out = new(bytes.Buffer) // captured output if err := echo(test.newline, test.sep, test.args); err != nil { t.Errorf("%s failed: %v", descr, err) continue } got := out.(*bytes.Buffer).String() if got != test.want { t.Errorf("%s = %q, want %q", descr, got, test.want) } } }
Notice that the test code is in the same package as the production code.
Although the package name is main
and it defines a main
function,
during testing this package acts as a library that exposes the
function TestEcho
to the test driver; its main
function is
ignored.
By organizing the test as a table, we can easily add new test cases. Let’s see what happens when the test fails, by adding this line to the table:
{true, ",", []string{"a", "b", "c"}, "a b c\n"}, // NOTE: wrong expectation!
go test
prints
$ go test gopl.io/ch11/echo --- FAIL: TestEcho (0.00s) echo_test.go:31: echo(true, ",", ["a" "b" "c"]) = "a,b,c", want "a b c\n" FAIL FAIL gopl.io/ch11/echo 0.006s
The error message describes the attempted operation (using Go-like syntax), the actual behavior, and the expected behavior, in that order. With an informative error message such as this, you may have a pretty good idea about the root cause before you’ve even located the source code of the test.
It’s important that code being tested not call log.Fatal
or os.Exit
,
since these will stop the process in its tracks; calling these
functions should be regarded as the exclusive right of main
. If
something totally unexpected happens and a function panics, the test driver will
recover, though the test will of course be considered a failure. Expected
errors such as those resulting from bad user input, missing files, or
improper configuration should be reported by returning a
non-nil error
value.
Fortunately (though unfortunate as an illustration), our echo
example is so simple that it will never return a non-nil error.
One way of categorizing tests is by the level of knowledge they require of the internal workings of the package under test. A black-box test assumes nothing about the package other than what is exposed by its API and specified by its documentation; the package’s internals are opaque. In contrast, a white-box test has privileged access to the internal functions and data structures of the package and can make observations and changes that an ordinary client cannot. For example, a white-box test can check that the invariants of the package’s data types are maintained after every operation. (The name white box is traditional, but clear box would be more accurate.)
The two approaches are complementary. Black-box tests are usually more robust, needing fewer updates as the software evolves. They also help the test author empathize with the client of the package and can reveal flaws in the API design. In contrast, white-box tests can provide more detailed coverage of the trickier parts of the implementation.
We’ve already seen examples of both kinds.
TestIsPalindrome
calls only the exported function
IsPalindrome
and is thus a black-box test.
TestEcho
calls the echo
function and updates the global
variable out
, both of which are unexported, making it a
white-box test.
While developing TestEcho
, we modified the echo
function to
use the package-level variable out
when writing its output, so
that the test could replace the standard output with an alternative
implementation that records the data for later inspection.
Using the same technique, we can replace other parts of the production
code with easy-to-test “fake” implementations.
The advantage of fake implementations is that they can be simpler to
configure, more predictable, more reliable, and easier to observe.
They can also avoid undesirable side effects such as updating a
production database or charging a credit card.
The code below shows the quota-checking logic in a web service that provides networked storage to users. When users exceed 90% of their quota, the system sends them a warning email.
package storage import ( "fmt" "log" "net/smtp" ) var usage = make(map[string]int64) func bytesInUse(username string) int64 { return usage[username] } // Email sender configuration. // NOTE: never put passwords in source code! const sender = "notifications@example.com" const password = "correcthorsebatterystaple" const hostname = "smtp.example.com" const template = `Warning: you are using %d bytes of storage, %d%% of your quota.` func CheckQuota(username string) { used := bytesInUse(username) const quota = 1000000000 // 1GB percent := 100 * used / quota if percent < 90 { return // OK } msg := fmt.Sprintf(template, used, percent) auth := smtp.PlainAuth("", sender, password, hostname) err := smtp.SendMail(hostname+":587", auth, sender, []string{username}, []byte(msg)) if err != nil { log.Printf("smtp.SendMail(%s) failed: %s", username, err) } }
We’d like to test it, but we don’t want the test to send out real
email. So we move the email logic into its own function and store
that function in an unexported package-level variable, notifyUser
.
var notifyUser = func(username, msg string) { auth := smtp.PlainAuth("", sender, password, hostname) err := smtp.SendMail(hostname+":587", auth, sender, []string{username}, []byte(msg)) if err != nil { log.Printf("smtp.SendEmail(%s) failed: %s", username, err) } } func CheckQuota(username string) { used := bytesInUse(username) const quota = 1000000000 // 1GB percent := 100 * used / quota if percent < 90 { return // OK } msg := fmt.Sprintf(template, used, percent) notifyUser(username, msg) }
We can now write a test that substitutes a simple fake notification mechanism instead of sending real email. This one records the notified user and the contents of the message.
package storage import ( "strings" "testing" ) func TestCheckQuotaNotifiesUser(t *testing.T) { var notifiedUser, notifiedMsg string notifyUser = func(user, msg string) { notifiedUser, notifiedMsg = user, msg } const user = "joe@example.org" usage[user]= 980000000 // simulate a 980MB-used condition CheckQuota(user) if notifiedUser == "" && notifiedMsg == "" { t.Fatalf("notifyUser not called") } if notifiedUser != user { t.Errorf("wrong user (%s) notified, want %s", notifiedUser, user) } const wantSubstring = "98% of your quota" if !strings.Contains(notifiedMsg, wantSubstring) { t.Errorf("unexpected notification message <<%s>>, "+ "want substring %q", notifiedMsg, wantSubstring) } }
There’s one problem: after this test function has returned,
CheckQuota
no longer works as it should because it’s still using the
test’s fake implementation of notifyUsers
. (There is always a risk
of this kind when updating global variables.) We must modify the test
to restore the previous value so that subsequent tests observe no
effect, and we must do this on all execution paths, including test
failures and panics. This naturally suggests defer
.
func TestCheckQuotaNotifiesUser(t *testing.T) { // Save and restore original notifyUser. saved := notifyUser defer func() { notifyUser = saved }() // Install the test's fake notifyUser. var notifiedUser, notifiedMsg string notifyUser = func(user, msg string) { notifiedUser, notifiedMsg = user, msg } // ...rest of test... }
This pattern can be used to temporarily save and restore all kinds of global variables, including command-line flags, debugging options, and performance parameters; to install and remove hooks that cause the production code to call some test code when something interesting happens; and to coax the production code into rare but important states, such as timeouts, errors, and even specific interleavings of concurrent activities.
Using global variables in this way is safe only because go test
does
not normally run multiple tests concurrently.
Consider the packages net/url
, which provides a URL parser, and
net/http
, which provides a web server and HTTP client library.
As we might expect, the higher-level net/http
depends on the
lower-level net/url
.
However, one of the tests in net/url
is an example
demonstrating the interaction between URLs and the HTTP client
library.
In other words, a test of the lower-level package imports the
higher-level package.
Figure 11.1.
A test of net/url
depends on net/http
.
Declaring this test function in the net/url
package would
create a cycle in the package import graph, as depicted by the upwards
arrow in Figure 11.1,
but as we explained in Section 10.1,
the Go specification forbids import cycles.
We resolve the problem by declaring the test function in an
external test package, that is, in a file in the net/url
directory whose package declaration reads package url_test
.
The extra suffix _test
is a signal to go test
that it
should build an additional package containing just these files and
run its tests.
It may be helpful to think of this external test package as if it had
the import path net/url_test
,
but it cannot be imported under this or any other name.
Because external tests live in a separate package, they may import helper packages that also depend on the package being tested; an in-package test cannot do this. In terms of the design layers, the external test package is logically higher up than both of the packages it depends upon, as shown in Figure 11.2.
Figure 11.2. External test packages break dependency cycles.
By avoiding import cycles, external test packages allow tests, especially integration tests (which test the interaction of several components), to import other packages freely, exactly as an application would.
We can use the go list
tool
to summarize which Go source files in a
package directory are production code, in-package tests, and external
tests.
We’ll use the fmt
package as an example.
GoFiles
is the list of files that contain the production code;
these are the files
that go build
will include in your application:
$ go list -f={{.GoFiles}} fmt [doc.go format.go print.go scan.go]
TestGoFiles
is the list of files that also belong to the fmt
package, but
these files, whose names all end in _test.go
, are included only
when building tests:
$ go list -f={{.TestGoFiles}} fmt [export_test.go]
The package’s tests would usually reside in these files, though
unusually fmt
has none; we’ll explain the purpose of
export_test.go
in a moment.
XTestGoFiles
is the list of files that constitute the external test package,
fmt_test
, so these files must import the fmt
package in
order to use it.
Again, they are included only during testing:
$ go list -f={{.XTestGoFiles}} fmt [fmt_test.go scan_test.go stringer_test.go]
Sometimes an external test package may need privileged access to
the internals of the package under test, if for example a white-box
test must live in a separate package to avoid an import cycle.
In such cases, we use a trick: we add declarations to an in-package
_test.go
file to expose the necessary internals to the
external test.
This file thus offers the test a “back door” to the package.
If the source file exists only for this purpose and contains no tests
itself, it is often called export_test.go
.
For example, the implementation of the fmt
package needs the
functionality of unicode.IsSpace
as part of fmt.Scanf
.
To avoid creating an undesirable dependency, fmt
does not import the
unicode
package and its large tables of data; instead, it
contains a simpler implementation, which it calls isSpace
.
To ensure that the behaviors of fmt.isSpace
and
unicode.IsSpace
do not drift apart, fmt
prudently
contains a test.
It is an external test, and thus it cannot access isSpace
directly, so
fmt
opens a back door to it by declaring an exported variable
that holds the internal isSpace
function.
This is the entirety of the fmt
package’s
export_test.go
file.
package fmt var IsSpace = isSpace
This test file defines no tests; it just declares the exported symbol
fmt.IsSpace
for use by the external test.
This trick can also be used whenever an external test needs to use some
of the techniques of white-box testing.
Many newcomers to Go are surprised by the minimalism of Go’s testing
framework. Other languages’ frameworks provide mechanisms for
identifying test functions (often using reflection or metadata), hooks
for performing “setup” and “teardown” operations before and after
the tests run, and libraries of utility functions for asserting common
predicates, comparing values, formatting error messages, and aborting
a failed test (often using exceptions). Although these mechanisms can
make tests very concise, the resulting tests often seem like they
are written in a foreign language. Furthermore, although they may
report PASS
or FAIL
correctly, their manner may be unfriendly to
the unfortunate maintainer, with cryptic failure messages like
"assert: 0 == 1"
or page after page of stack traces.
Go’s attitude to testing stands in stark contrast. It expects test authors to do most of this work themselves, defining functions to avoid repetition, just as they would for ordinary programs. The process of testing is not one of rote form filling; a test has a user interface too, albeit one whose only users are also its maintainers. A good test does not explode on failure but prints a clear and succinct description of the symptom of the problem, and perhaps other relevant facts about the context. Ideally, the maintainer should not need to read the source code to decipher a test failure. A good test should not give up after one failure but should try to report several errors in a single run, since the pattern of failures may itself be revealing.
The assertion function below compares two values, constructs a generic error message, and stops the program. It’s easy to use and it’s correct, but when it fails, the error message is almost useless. It does not solve the hard problem of providing a good user interface.
import ( "fmt" "strings" "testing" ) // A poor assertion function. func assertEqual(x, y int) { if x != y { panic(fmt.Sprintf("%d != %d", x, y)) } } func TestSplit(t *testing.T) { words := strings.Split("a:b:c", ":") assertEqual(len(words), 3) // ... }
In this sense, assertion functions suffer from premature abstraction: by treating the failure of this particular test as a mere difference of two integers, we forfeit the opportunity to provide meaningful context. We can provide a better message by starting from the concrete details, as in the example below. Only once repetitive patterns emerge in a given test suite is it time to introduce abstractions.
func TestSplit(t *testing.T) { s, sep := "a:b:c", ":" words := strings.Split(s, sep) if got, want := len(words), 3; got != want { t.Errorf("Split(%q, %q) returned %d words, want %d", s, sep, got, want) } // ... }
Now the test reports the function that was called, its inputs, and
the significance of the result; it explicitly identifies the actual
value and the expectation; and it continues to execute even if this
assertion should fail. Once we’ve written a test like this, the
natural next step is often not to define a function to replace the
entire if
statement, but to execute the test in a loop in which s
,
sep
, and want
vary,
like the table-driven test of IsPalindrome
.
The previous example didn’t need any utility functions, but of
course that shouldn’t stop us from introducing functions when they
help make the code simpler.
(We’ll look at one such utility function, reflect.DeepEqual
, in
Section 13.3.)
The key to a good test is to start by
implementing the concrete behavior that you want and only then use
functions to simplify the code and eliminate repetition. Best results
are rarely obtained by starting with a library of abstract, generic
testing functions.
Exercise 11.5:
Extend TestSplit
to use a table of inputs and expected outputs.
An application that often fails when it encounters new but valid inputs is called buggy; a test that spuriously fails when a sound change was made to the program is called brittle. Just as a buggy program frustrates its users, a brittle test exasperates its maintainers. The most brittle tests, which fail for almost any change to the production code, good or bad, are sometimes called change detector or status quo tests, and the time spent dealing with them can quickly deplete any benefit they once seemed to provide.
When a function under test produces a complex output such as a long string, an elaborate data structure, or a file, it’s tempting to check that the output is exactly equal to some “golden” value that was expected when the test was written. But as the program evolves, parts of the output will likely change, probably in good ways, but change nonetheless. And it’s not just the output; functions with complex inputs often break because the input used in a test is no longer valid.
The easiest way to avoid brittle tests is to check only the properties you care about. Test your program’s simpler and more stable interfaces in preference to its internal functions. Be selective in your assertions. Don’t check for exact string matches, for example, but look for relevant substrings that will remain unchanged as the program evolves. It’s often worth writing a substantial function to distill a complex output down to its essence so that assertions will be reliable. Even though that may seem like a lot of up-front effort, it can pay for itself quickly in time that would otherwise be spent fixing spuriously failing tests.