Throughout this book, I’ve been looking for the blend of rules and rule-breaking that come together to make writing work—or rather to make writing excel. It’s an odd mix of consistency and the unexpected, of simple communication and whimsical delight, but it’s what we find driving our best fiction forward.
I grew up on Roald Dahl books. Charlie and the Chocolate Factory, The BFG, Matilda, and The Twits were some of the most popular books in my elementary school despite the fact that we were reading them forty years after they were penned. But as I was researching and writing this book, there was one forgotten Dahl short story that I found myself rereading again and again: “The Great Automatic Grammatizator.”
An engineer named Adolph Knipe dreams of writing stories that people will read. Knipe looks at the opening of his latest failed attempt at a novel, which begins, of course, “The night was dark and stormy.” Then he has a eureka moment.
. . . he was struck by a powerful but simple little truth, and it was this: That English grammar is governed by rules that are almost mathematical in their strictness! Given the words, and given the sense of what is to be said, then there is only one correct order in which those words can be arranged.
He invents a machine he calls the Great Automatic Grammatizator. It takes in plots and can spit out a finished story. Knipe starts with short stories. Soon he dials up the complexity to the length of the novel and, getting more daring, he programs it to write a “high class intelligent book.” Knipe harnesses his machine to the point where “one half of all novels” published in English are written by the Great Automatic Grammatizator.
Now a tycoon thanks to the power of his bestseller machine, Knipe forces other writers to the brink of starvation. The narrator of this story is not the engineer Knipe but an opposing author. Knipe offers the narrator a contract not to write. Signing would allow the narrator to eat, but the machine-generated stories would take over. Or the narrator can refuse to sign, which would allow him to keep writing but leave him broke. The story ends with a plea from the narrator: “Give us strength, Oh Lord, to let our children starve.”
Like “The Great Automatic Grammatizator,” my book has been about the marriage of numbers and words. People often have polarized reactions when objective analysis is applied to art. As I’ve discussed this book, I’ve encountered two opposite camps, which I’ve categorized in my head as the extreme skeptic and the doomsdayer.
The extreme skeptic is uneasy every time they see a number next to a word. Writing is an art, not a science, so how can math provide any substance?
If you’ve made it this far in the book, I hope I have convinced you not to be that extreme skeptic. I’ve tried to tackle questions that are common to readers and writers. There’s a distinct benefit to being able to run through millions of words at once. You may lose a word’s impact on a particular page, but a new appreciation for an author can come to light. Patterns that are spread out over a corpus of literature, too large to be consumed by any one reader, can reveal trends, ideas, techniques, and wisdom that would otherwise stay hidden.
In contrast to the skeptic, there is the person yelling that the sky is falling whenever they see anything to do with numbers and texts. If numbers can help us predict what will be popular to read, when will an algorithm just start writing novels for us? This is “The Great Automatic Grammatizator” distilled.
Even today, more than sixty years after “The Great Automatic Grammatizator” was published, the concept is far-fetched science fiction. The numbers I’ve looked at here, and the calculations I’ve used, can help us read and see patterns—but they can’t help us know when to break them. The questions I sought to answer in my book are primitive. Are there words worth avoiding? How do popular authors use certain words? What are the most substantial differences in the way people from different backgrounds write?
These questions are only a starting point for writers or readers—not an attempt to “engineer” art as much as a way to understand it or describe it. If you were an aspiring painter in 1900 you might want to know the specific paints and techniques that Monet was starting to use. If you were a band in the 1960s you would want to know how the Beatles were recording their songs. In either case, you would want to understand the craft in detail and on a technical level before making your own masterpiece. Reading a book is the easiest way to understand how a novel is crafted. Examining the patterns of thousands of books is going to answer different questions, but it is likewise a useful way to understand how books are truly crafted.
Somewhere between the skeptic and doomsdayer is where I hope you have landed after reading this book. Successful writers pen hundreds of thousands of words in their lifetime. In any other field with hundreds of thousands of data points it would be quite clear that the information could be mined to examine human behavior and psychology. I believe the same is true for examining words.
When Frederick Mosteller and David Wallace used equations to determine the authorship of The Federalist Papers, they were solving one small question about writing. It was a question with a definite answer, which made it simpler, but it showed that information that may not be obvious on first read is right there, hiding in plain sight.
The written word and the world of numbers should not be kept apart. It’s possible to be a lover of both. Through the union of writing and math there is so much to learn about the books we love and the writers we admire. And by looking at the patterns, we can appreciate that beautiful moment where the pattern breaks, and where a brilliant new idea bursts into the world.