During the first problem-solving session in our workshop, I witnessed a rare event. One of the students—very bright and very vocal—worked out an exact, clearly stated solution to the problem at hand. Given the circumstances of the class, such an exact solution is rarely found, and even more rarely communicated.
The solver—let's call him Mack—was working as part of a "division"—one half of the class competing with the other half of the class. He perceived how his solution could be applied, by coordinated action of the two divisions, "to achieve the maximum number of points." After some discussion, he convinced a few members of his division he might possibly have the key to their problem. They indicated that if he could convince the other division to go along with his idea, they would follow.
What was rare about this event was not that Mack thought he had a complete solution. Many students think that about each simulation. The unusual thing was he actually did have a solution. Moreover, he could state his solution in precise mathematical terms. I winced as I watched him lay out his formula on the blackboard because he used APL notation. Some of the students understood APL, but others did not. Although they were all experienced programmers, and it would have been easy for Mack to explain his APL notation, he was so excited by his idea he didn't notice he had lost three-fourths of his audience.
As he continued, his audience raised other small questions about his idea, but he never really paused to answer them to everyone's satisfaction. Having constructed the problem, I naturally understood his solution, but I refused to give him or the others any sign concerning its correctness. In life, after all, there seldom is an omniscient judge sitting on the sidelines to say "hurrah" when you get the right answer.
Lacking a judge, Mack's solution had to stand or fall on his presentation. It fell. And with it fell Mack's credibility with the class, a credibility he unsuccessfully struggled to recover for the rest of the week. The more he lost credibility, the harder he tried to present his good ideas on each exercise. The harder he tried, the faster he went; and the faster he went, the less people listened.
I wanted to tell Mack something to help him restore himself to full acceptance by the class, but I was unable to conjure up an adequate image. After the class, however, a few of the students were sitting around exchanging war stories about the good old days. None of them had ever worked on a "naked" machine, without benefit of an operating system to handle I/O chores, so I tried to explain what fun it had been to program in that environment.
I was telling them about the PDP-1 I had used in the early 1960s to run psychological experiments. In order to write a magnetic tape that could be read on IBM equipment, I had to program each and every bit on each and every record going out of the machine. I had to write programs to compute the parity bit for each character, and the overall parity character for the entire record. That impressed them, but they were astounded when I explained I also had to program the timing of each character's passage to the tape.
In other words, to keep the correct IBM spacing between characters, my program had to take precisely the number of cycles needed to move the tape that distance. Moreover, at the end of the record, it had to delay an additional small amount to give the correct spacing for the parity character. Then it had to allow for writing half of the inter-record gap. At the start of the record, the other half of the gap would be written, and timed, by my code.
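For readers who never met a naked machine, the two parity computations can be sketched in modern terms. This is an illustrative Python reconstruction, not the original PDP-1 code (whose word size and tape format differed in detail): a "vertical" parity bit is framed onto each 6-bit character, and a "longitudinal" check character—the XOR of every character—closes the record.

```python
# Sketch of the two parity computations: a vertical (per-character)
# parity bit and a longitudinal parity character for the whole record.
# Illustrative only; the character codes below are hypothetical.

def parity_bit(char_code, odd=True):
    """Parity bit chosen so the 1-bits, parity included, total an
    odd (or even) number."""
    ones = bin(char_code).count("1")
    return 1 - (ones & 1) if odd else ones & 1

def longitudinal_check(record):
    """XOR of all characters: the check character written after the data."""
    check = 0
    for ch in record:
        check ^= ch
    return check

record = [0o25, 0o63, 0o41]  # hypothetical 6-bit character codes
framed = [(ch << 1) | parity_bit(ch) for ch in record]
tape_record = framed + [longitudinal_check(record)]
```

The timing problem, of course, is exactly what this sketch leaves out: on the real machine, each element of `tape_record` had to leave the processor at precisely the right cycle.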
"What would happen," one of them asked, "if your timing was off?" I explained: if the characters didn't get sent fast enough, the record would be too stretched out for the IBM equipment to read, but that wasn't the question that bothered him. "What I meant was, what would happen if you sent out the characters too fast for the tape?"
As I searched for an explanation, the image of Mack came into my head. "Well," I said, "it would be just like Mack on Sunday night, trying to explain his solution to the entire class. The characters would go out one on top of the other, producing gibberish on the tape, but there would be no indication anything was wrong. The only time you'd know you'd overrun the output device was when you tried to recover the information on the record."
"Like Mack, when he couldn't understand why we didn't go along with his idea?"
"Precisely." I was sorry I couldn't have explained it as easily to Mack himself.
Mack, like so many computer programmers and analysts, had a very high I.Q. and knew many facts about computers and the rest of the world. He was fond of taking I.Q. tests, perhaps because it reassured him he was a highly talented person—in spite of the evidence he was getting from his colleagues, who never seemed to want to listen to him. Having a high I.Q. is like a CPU's having a terrific computing speed. It's a great asset in problem solving—as long as the problem doesn't involve a lot of input or output.
But when it's necessary to communicate with other people in order to convert the idea of a solution into an actual solution, such high internal speed commonly causes overrunning. Most of us have experienced being overrun at some times in our lives. Frequently, it was in school, where some professor believed the purpose of standing in front of the class was to demonstrate who in the room had the highest I.Q.
Another place we may be overrun is in books, especially if they are in subjects not quite in our own specialty. Albert Einstein once explained why this frequently happens:
Most science books said to be written for the layman seek more to impress the reader ("Awe-inspiring!" "How far we have progressed!" etc.), than to explain clearly the elementary aims and methods. After an intelligent layman has tried to read a couple of such books he becomes completely discouraged. His conclusion is: I am too feeble-minded and had better give up.
If the author of the book isn't an "authority," we may preserve our own egos by concluding the author, not us, is feebleminded. We often come to this conclusion when reading manuals or computer science journals. We also do so when we listen to someone, like Mack, whose communication skills are several orders of magnitude below his raw intelligence.
If Mack were to encounter a computer system unbalanced in the direction of CPU power, he would instantly know precisely what to do to remedy the situation. He certainly wouldn't spend time trying to make the programs run even faster—as he did when he was the system.
Mack needed—as so many bright young computer people need—to build up his input and output capabilities until they were in better balance with his environment. When writing or talking, Mack could reduce the quantity of output and consume some of his excess computing power in improving the quality of his output. I do not mean, however, that he should sit quietly between his outbursts polishing what his next outburst will be just as soon as those other people shut up.
One way you can trade computing power for improved output is to use a lot of power processing your input. We usually call that "listening." It's sometimes hard to know when someone is listening—rather than merely waiting to seize control of the conversation. But everyone knew Mack wasn't listening because he seldom allowed other people to finish what they were saying. Why bother letting you finish, he seemed to be saying, when I know exactly what you're going to say and am prepared to go one or two steps faster?
As Einstein suggests, perhaps Mack was merely trying to impress his classmates. If so, he still failed. After the first couple of days, few of them listened to him at all. They were about as impressed with him as I was with the PDP-1. It would have been a great machine—if I hadn't known any better.
"If you want someone to like you," says an old Russian proverb, "let them do you a favor. If you want them to hate you, do them a favor." A similar wisdom applies to impressing people with your intelligence: If you want someone to think you're smart, listen and understand what they say. If you want them to think you're stupid, keep interrupting them with your great ideas.
Anyway, I thought you'd all like to interrupt what you were doing to read these great ideas of mine.
A number of people, some as authoritative as Edsger Dijkstra, have asserted the single most important asset programmers can have is ability in their native language—reading and writing. Although I don't believe anything is the single most important asset a programmer can have, I do share this high opinion of the value of language ability.
Unfortunately, there aren't many publishers who think the same thing about programming authors. Perhaps good writers would be more attracted to the programming field if our textbooks seemed to place more importance on clarity. Perhaps more working programmers would concentrate on improving their own clarity if they had good role models among our authors.
Writing in English and writing in, say, COBOL are similar activities—but not because COBOL is "like English." The secret key to most good writing is re-writing. It's true in English and it's true in COBOL. It's even true in APL and LISP. Nobody but a genius is capable of writing perfect prose on first draft. What a writer or a programmer needs is what Ernest Hemingway called "a built-in shit detector."
Let's see what Hemingway meant. Try reading the following paragraph. Then, if you like, try rewriting it using only half the words:
As we know all COBOL programs have four divisions. The first three—the identification, environment, and data divisions—specify various aspects of the program but do not describe any of the processing that is to take place. The actual instructions that specify what processing steps are to take place during the execution of the program are included in the procedure division. It is only through instructions in the procedure division that the programmer can communicate to the computer the operations that are to be performed.
Rewritten with no more than half the words:
Of the four divisions in a COBOL program, the first three—identification, environment, and data—do not describe any processing. The actual instructions specifying processing steps are found in the procedure division, and only in the procedure division.
I hope everyone agrees the second one is worth the five minutes it took to construct it. Look at it this way: If a book is published containing the first example instead of the second, 12,000 readers are going to waste one minute apiece. That's 200 hours saved for a five-minute investment. And that's not counting any hours wasted by misunderstanding a technical matter.
How do we decide when it's worth optimizing a program? The key question is this: How much time will be spent running this program in the future?
And how do we decide when it's worth rewriting some text? The key question is this: How much time will be spent reading this text in the future?
In the old days, we used to rewrite assembly language programs a dozen times to save three machine cycles. We usually don't have to do that now, because now the major cost is for people, not machines. Well, how many "people cycles" are we going to save by rewriting such things as the following "explanatory" comment:
THE FOREGOING DEVICE WAS NECESSARY IN ORDER TO PERMIT THE SAME PROGRAMMING TO CALCULATE THE STATIONARY AND THE STABLE POPULATIONS. ON THE FIRST ROUND THE GIVEN VALUE OF R WAS ENTERED. ON THE SECOND ROUND THE VALUE IN R WAS SAVED IN RR, R WAS SET EQUAL TO ZERO, AND THE CONTROL TRANSFERRED TO 15. THE PROGRAM RECOGNIZED THAT IT WAS ON THE SECOND ROUND BY FINDING ZERO IN R, AND IT THEN TRANSFERRED TO 23, HAVING PUT THE STABLE POPULATION IN THE ARRAY VKKA AND THE STATIONARY IN VKK, FROM WHERE THE STATIONARY IS TRANSFERRED IN THE LOOP AT 25 TO VLL.
I am sparing you the pain of trying to read the program this comment was attempting to explain. I'm sure you can imagine how much time a maintenance programmer will have to spend trying to understand it.
Don't fall into the trap of rewriting such comments. As long as the bull lives, there will be more such B.S. Comments as confused and confusing as this are not the disease. They are merely symptoms. If we cut open the code, we'd find a mess so repulsive we couldn't print it in a respectable publication. That's where the detector tells us to rewrite.
There are many symptoms signaling the need for rewriting a program, a comment, or a description. Most of them fall under the heading of "efficiency": If it takes more time to understand than it took to write, then rewrite or throw it away and start over. Try this efficiency rule on the following gem:
In order to remove the time-consuming procedure of the program which initiated an I/O operation having to check periodically to see whether the operation has been completed or not, as well as to improve the overall performance efficiency of a computer, the interrupt facility was introduced.
If you're typical of the readership of the manual from which this was taken, you'll probably have to read this item several times to untangle its incredible subject. By the time you've done that, you've undoubtedly spent more time on it than the author did. In about the same amount of time, you can recast it into something a person can comprehend in a single reading:
There are two principal reasons for a computer's having an interrupt facility:
(1) to eliminate periodic program checking for termination of a parallel process
(2) to improve overall performance efficiency.
Of course, when you rewrite so the sentence can be understood, you may discover the meaning was not correct in the first place. In this case, for example, we might not agree with this analysis of the principal reasons for the interrupt facility, or we might think the second point is merely a repetition of the first.
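The contrast the rewritten passage draws can itself be put in code. The following is a toy Python sketch of the idea only—real interrupts are a hardware mechanism, modeled here as a callback invoked by the "device"—showing that polling spends one status check per tick, while the interrupt style spends none:

```python
# Toy contrast between polled and interrupt-driven I/O completion.
# Illustrative only; both functions and their names are invented here.

def polled_io(ticks_until_done):
    """Program checks the device status once per tick until done."""
    checks = 0
    for tick in range(ticks_until_done + 1):
        checks += 1                     # the periodic status check
        if tick == ticks_until_done:    # device finally reports done
            return checks

def interrupt_io(ticks_until_done, on_complete):
    """Program does no checking; the device calls the handler when done."""
    on_complete(ticks_until_done)       # the "interrupt"
    return 0                            # zero status checks by the program

completions = []
polled_checks = polled_io(100)          # one check per tick, plus the last
interrupt_checks = interrupt_io(100, completions.append)
```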
Incorrectness surely signals the need for a rewrite. Sometimes the error is a direct result of poor writing, as in the following:
However, like human memory, data can be placed in the computer's internal storage and then can be recalled at some time in the future.
The writer of this manual probably didn't intend to say data were like human memory but, rather, the computer's internal storage was like human memory. It probably would have been best to leave the mysteries of human memory to a physiology textbook, and rewrite simply as:
The key property characterizing a computer's storage medium is the ability to accept data at one time and to make those data available for recall at future times.
In programs, as in English text, indirectness or roundabout methods of expression usually indicate the need to rewrite. In programs we see such symptoms of indirectness as the following:
1. Introduction of superfluous temporary data elements
2. Loops with awkward exceptions or complex termination conditions
3. Moving and re-moving the same data items
4. Duplicated statements
5. Excessive housekeeping at start or finish
In natural English, we see similar signs, such as the following:
1. excessive use of pronouns
2. repetitious prose
3. difficulty of getting started
4. difficulty finishing.
In English, though, one particular syntactic structure almost always indicates excessive indirectness:
5. passive voice.
We could have spotted several of the troublesome examples above merely by observing their passivity. Sometimes, however, the writer is intentionally passive, as when writing specifications or sales documents without committing the writer to anything. Bob Finkenaur of Northeastern University contributed the following example.
Preparation H is a popular American over-the-counter remedy favored by such professionals as programmers and analysts who spend much of their day exercising their posteriors. A popular commercial for Preparation H states:
"Preparation H is doctor-tested and helps to give prompt temporary relief from the discomfort of hemorrhoidal tissues."
To the sufferer, this ad makes the product sound terrific, but on close analysis, we see that Preparation H
1. is tested by doctors, not necessarily approved
2. helps, but doesn't work alone
3. is prompt, but not immediate
4. works on the discomfort, not on the physical condition itself
5. gives relief, not cure
6. is temporary, not permanent
Next time you analyze a hardware or software proposal, try subjecting it to the Preparation H Test. In fact, why don't we put all the following tests together and call them the Preparation H Test? Whenever you read something intended for other eyes, ask yourself:
1. Is it more trouble to read than to rewrite?
2. Is it correct?
3. Is it misleading?
4. Does it give me a pain only Preparation H can relieve?
If the answer to any one of these is yes, rewrite it. Or scrap it. You may hurt the sales of Preparation H, but otherwise the world will love you for it.
Don't, however, apply the test to this book. Any problems here are undoubtedly due to poor editing or typographical errors. And passive voice is never used. Right?
"Then you should say what you mean," the March Hare went on.
"I do," Alice hastily replied, "at least—at least I mean what I say—that's the same thing, you know."
"Not the same thing a bit!" said the Hatter. "Why, you might just as well say that 'I see what I eat' is the same thing as 'I eat what I see'!"—Lewis Carroll, Alice's Adventures in Wonderland
Alice was the prototypical programmer. So often, when reading code, I'm reminded of Alice's naive remarks at the Mad Hatter's tea party. Few programmers understand the difference between saying what they mean and meaning what they say. For instance, I once found this code in a PL/I program:
I = 1;
XYZ: A(I) = B(I);
I = I + 1;
IF I < 21 THEN GO TO XYZ;
What does this code mean? Indeed, what does it mean to ask "What does this code mean?" One obvious meaning of a piece of code is the instructions it causes to be executed—its meaning to the computer. But far more important is the meaning to some other person who encounters this code for the first time and tries to understand it.
I'm particularly interested in the idea of how long it takes to understand a piece of code. This question is important for teaching, but much more important for the ever-expanding job of maintenance. In its lifetime, a piece of code is written once and read perhaps hundreds of times. Therefore, if it doesn't say what it means, and only what it means, it's going to add up to years of labor wasted in misunderstandings.
The above code was hard to understand for many reasons, most of which can be discussed under the query "Why wasn't it written some other way?" Alternatively, "If it meant something else, why didn't it say something else?"
For instance, the code seems to form a loop, with the variable I ranging from 1 to 20. But PL/I happens to contain a loop control structure, the DO, which was specifically designed for such loops, as in
DO I = 1 TO 20; A(I) = B(I); END;
If the programmer didn't choose to use this special form, the reader must ask, "Is there more to this than meets the eye?" Perhaps the label XYZ is used for some other purpose, such as a branch from outside the loop. That would justify this structure, for the language forbids us to branch into a DO. Or possibly there is some other reason, too subtle for us to perceive without help. But if, indeed, the programmer merely meant the same thing as the simpler DO loop, she has said more than she meant and thereby given us grief.
Even the DO-loop may say more than it means. What's the special significance of the number 20? Does the programmer really mean
DO I = 1 TO N; A(I) = B(I); END;
where N is the number of elements to be moved? In this form, the code says to move the elements of B numbered 1 through N to the corresponding elements of A, in order. That's pretty clear, but perhaps that, too, is more than was meant.
Possibly, what the programmer meant was
DO I = 1 TO HBOUND(A, 1);
A(I) = B(I); END;
which says move the number of elements in A from B to A, starting with element 1. In that case we don't have to search to discover what, if anything, was the special significance of 20, or of N. The program says less and means more.
But wait! What about the number 1? In PL/I, array subscripts don't have to start with 1, so perhaps the loop doesn't mean what we thought after all. It could be a trap for the unwary maintenance programmer, though it probably means "the number of elements in A." If that's what it was supposed to mean, then the programmer would have been kinder to write:
DO I = LBOUND(A, 1) TO HBOUND(A, 1);
A(I) = B(I); END;
which leaves no doubt in the reader's mind that we are moving the number of elements in the array A, regardless of what its bounds might be.
But there are still questions. This loop could have been written
A = B;
Why wasn't it? The reader has the right to ask, "If the program is simply moving one array into another, why didn't the writer use array assignment?" Must we assume that the writer didn't mean what she said? And if we assume that in one place, what stops us from assuming it everywhere?
Even the simple statement
A = B;
leaves much room for unintended meaning. If we're compiling the program for parallel processing, are we allowed to move all the elements at the same time? PL/I's definition says no, for there is a definite order in which the elements of B must be moved to the elements of A. We may not care about the order, but then again we may—if there are possible interrupts. PL/I doesn't really provide the language here to say what we mean, which might be
A = B, IN NO PARTICULAR ORDER;
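Why the order can matter is easy to demonstrate outside PL/I. Here is a hedged Python sketch (the function names are mine, and this is not PL/I's exact rule): when source and target storage overlap, a copy made in a defined left-to-right order and a copy made "all at once" give different results.

```python
# Shifting elements within overlapping storage. A strict left-to-right
# element-by-element copy differs from a "simultaneous" copy that
# snapshots the source first. Illustrative names, not PL/I semantics.

def copy_in_order(a, src_lo, dst_lo, n):
    """Copy n elements left to right, like a defined-order loop."""
    for i in range(n):
        a[dst_lo + i] = a[src_lo + i]

def copy_all_at_once(a, src_lo, dst_lo, n):
    """Snapshot the source first, as if all moves happened at one instant."""
    snapshot = a[src_lo:src_lo + n]
    a[dst_lo:dst_lo + n] = snapshot

x = [1, 2, 3, 4]
copy_in_order(x, 0, 1, 3)       # overlap: x[0] propagates rightward
y = [1, 2, 3, 4]
copy_all_at_once(y, 0, 1, 3)    # overlap: behaves as a true shift
```

When the regions do not overlap, the two copies agree, which is exactly why the distinction is so easy to overlook until an interrupt—or a maintenance change—exposes it.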
Also, it might seem that A and B are likely to be the same size, but that's not necessarily the case, considering only this statement. As long as B's bounds are contained within A's bounds, the assignment is permitted, so perhaps the programmer meant that in general B may be smaller than A. To find out, we might have to look elsewhere, such as at the declarations of B and A.
In the original program, A and B were declared with the same dimensions, but in two separate statements that looked something like this:
DECLARE A(N) FIXED;
DECLARE B(20) FIXED ...
N was declared outside the procedure containing this code but was assigned the value of 20 just before the procedure was called. It all fit together, but if any one element was changed, it sprang a trap on the unwary maintenance programmer.
Indeed, there was good reason to believe the value of N would change over time, in which case at least three places in the code would have to be changed to keep it current. Actually, A was a parameter array passed to the procedure, and its dimensions were set by the passed array. It could have been declared
DECLARE A(*) FIXED;
But, then, what about B? When I studied the full declaration of B, I found it to be
DECLARE B(20) FIXED INITIAL((20) 0);
which means it was to contain 20 zeros. Aha! It seems that B's only use was to set all the elements of A to zero upon entry to the procedure, which could have been expressed better by eliminating B altogether and writing
A = 0;
Expressed in this form, the code eliminates most of the other questions. It's not perfect, but see how much closer it comes to the ideal expressed by that greatest of all programmers, Humpty Dumpty: "When I use a word, it means just what I choose it to mean—neither more nor less."
If we only had more Humpty Dumptys, perhaps we'd have less rinky-dink programming.
Although we imagine ourselves to be modern and free of superstitious nonsense, we do have some curious rituals. One of the most curious is based on most people having ten fingers. Over the years, this accident led our ancestors to develop a system of counting based on the number ten.
Because of this system, every tenth year ends in a zero, and the next-to-last digit changes. This change seems to catch some system designers by surprise, and several of my clients were caught short when 1979 turned into 1980. Many more, I suppose, went down the tubes when 1999 turned to 2000.
I doubt the number system will change. (If you don't believe the number system is likely to remain unchanged, consider the ancient Babylonian base-60 system. It's still around after thousands of years, in our reckoning of minutes and seconds and in our measurement of angles.) No, our decimal number system is rather familiar, and we're rather suspicious of the unfamiliar. Besides, most people—other than systems designers—have heard about the changing of the penultimate digit. Indeed, the common people actually anticipate this change with much excitement.
This magical significance of the number 10 carries over to the celebration of the anniversaries of significant events in our lives, such as births, marriages, and graduations. And, because 10 happens to be divisible by 5, some of the magic seems to rub off on every fifth anniversary as well. For instance, graduations from high school and college are ritually observed on every anniversary ending in 5 or 0 by holding the "class reunion." Difficult as these rituals are to bear, most people don't have to endure more than one in any given year, because in the American system most people take four years to go through college after they graduate from high school. Although tradition sometimes seems nonsense, at times it serves a purpose we don't notice. In this case, the number 5 saves us from too many embarrassing comparisons with fellow alumni in any one year.
Alas for me, I took five years to complete college, so every five years I have to contend with two reunions. I know I can't possibly survive actually attending these reunions, but I do feel I should do something to demonstrate my solidarity with the revered customs of our people.
I decided to spend fifteen minutes reflecting on what I learned in school—something unusual for a special reunion column. Unfortunately for me, I don't remember much about what I learned in school. It may be buried within me, or even woven within the very fibers of my being, but generally I can't extract a specific item and say, "This I learned in school." With one exception! Much as I hate to admit this, there is one course I took which I remember clearly, the lessons of which I use every day, and which I consider myself most fortunate to have taken.
No, it wasn't my first computer course, which I never took. I'm so ancient they didn't even have computer courses when I went to school. No, not some fundamental math course. (Yes, you cynics, math had been invented a few years before my time.) Not some study of the immortal works of literature. Certainly not the one psychology course I ever took, for I dropped it after twenty minutes of the first lecture. No, the course I remember with such gratitude is nothing so grand as these, but a simple, unassuming course called "Scientific Greek."
The Scientific Greek course had been invented by a kindly old gentleman in the classics department as a way of preserving his job in a time when classics had reached bottom. Each week we were given a list of Greek roots we had to memorize and regurgitate on a weekly examination. The words were spelled in English, so we didn't even have to master the Greek alphabet. (The fraternity forced me to master the alphabet by beating me with a paddle whenever I missed a letter.)
It's a bit frightening to me, in view of my opinions about education, to recall how this course used nothing in the way of either analytical or creative ability. The only requirement was memorization, pure and simple. The more words and roots you memorized, the higher your grade. True, the lessons on each root were surrounded by captivating anecdotes drawn from a lifetime of classical scholarship, but we didn't have to recall any anecdotes on the tests. True, the anecdotes might have helped us remember the roots, but though many of the roots remain in my mind to this day, I can't dredge up a single one of the stories.
The roots do come up all the time, and they're sometimes worth a fortune. More than once, in fact, they may have saved my life. Like the time I was laid out in a hospital bed with my neck swollen to size 40 from an infected tooth. An orderly wheeled in a cloth-covered cart. I heard metal instruments clinking under the cloth, so I managed to whisper, "What's that?" The attendant never even glanced in my direction, but automatically uttered these soothing words: "Oh, it's nothing. Just some stuff we need to do the tracheotomy."
As he wheeled and left the room, my scientific Greek bubbled up to my rescue. I'd never heard the word tracheotomy before, but I was immediately able to discern its meaning from its Greek roots. They intended to cut my throat!
Needless to say, forewarned was forearmed, and by the time the doctor arrived to do the dirty deed, I was ready to talk him out of it. And I did talk him out of it! I think he was caught off guard by my knowing what was about to happen. Routine medical procedure requires the patient be kept in a state of ignorance until the last possible moment. Then the patient is asked to sign an "informed consent"—a document giving the doctors several layers of legal protection against malpractice suits in case the surgery doesn't work out quite right. Or in case it's a success but the patient dies.
Because of my smattering of Greek, I had been able to prepare a number of questions to ask before I signed the consent form. Also because of my Greek, I was able to understand some of the answers to my questions, couched as they were in the otherwise secret language of medicine. Armed with these answers, I decided I'd rather risk not having my throat cut for a little while, no matter how convenient it might have been for the doctor's schedule to do it immediately. Happily, the swelling started to respond to antibiotic treatment, and my throat remains intact to this day.
Recently, I was sick for several months with a mysterious ailment no treatment seemed to abate. In desperation, my doctors put me in the hospital for tests and observation—a heartwarming place to spend the Christmas season. After five days of poking, probing, and prying, they seemed more baffled than ever. Two of them, charts in hand, held a mini-conference at the foot of my bed, and one said to the other, "Perhaps it's iatrogenic."
"I was beginning to think that myself," the other agreed solemnly.
"Quite likely iatrogenic."
I believe the conversation was supposed to sound sufficiently ominous to induce me to submit to more severe treatment. But before the doctors could proceed, my scientific Greek flew once more to my rescue.
"Well," I said, "if you doctors are the cause, perhaps you ought to stop treating me altogether and let me go home." They seemed a bit surprised I had understood them, so I maintained my advantage and pressed for early discharge. Once out, I changed doctors and subscribed to a medical service I pay for keeping me well, not curing my illnesses. From that moment, I felt better (until the cancer floored me). I don't think my cancer was iatrogenic, and my (new) doctors performed a marvelous cure, for which I'm truly grateful.
How nice, you're saying, but what does all this have to do with computing? Nothing directly, I suppose, because we programmers don't speak scientific Greek. But we do speak a kind of Pig Latin (JCL, front-end-database-microprocessor, distributed-intelligence-cyclic-network-architecture, COBOL, SNOBOL, SPITBOL, APLGOL, DAMMITOL) serving many of the same purposes as the medicine man's Greek, Latin, and inscrutable handwriting.
And what purpose is that? Why, to conceal the programmogenic pathologies from our ignorant corps of clients. Just like the doctors, we need some way of concealing our mistakes other than covering them with soil and planting daisies. In medicine, iatrogenic means, literally, "born of the healing process"—in short, a disease caused by the doctors themselves. You've got to admit that iatrogenic sounds and smells a lot better. We've got to find a similar word for ourselves, but the Greeks didn't have any programmers, so what root could we use?
By the way, in case you're wondering why it took me five years to get through a four-year college program, I spent one year as the victim of another iatrogenic pathology. I didn't know it at the time—it was the year before I took the Scientific Greek course—but much later I figured it out. At the time, though, I not only didn't realize the doctors had made me sick, but I actually thought they had saved my life! I was so impressed with medicine, I took up pre-medicine for a year before the truth dawned on me and I returned to computers.
And I'm glad I returned. In our business, who ever heard of a client so ignorant of both Greek and computing he thought his tormentor was his savior? Surely, with our emphasis on training programmers for clear communication, such an outrage could never occur!
Once I've made up my mind, I love to hear arguments supporting my position. After I bought my diesel Rabbit, I began to collect articles extolling the virtues of diesels, Rabbits, and diesel Rabbits. After I lost a stone (6.35 kilograms or 14 pounds, for you Yanks) dieting, I consumed article after article extolling the virtues of slimness. And ever since I made up my mind that HIPO could safely be ignored, I've relished any and all evidence supporting the documentary value of simple program listings and English narrative texts.
Imagine my delight, then, when Computerworld carried this provocative headline:
Documentation study proves utility of program listings.
As I devoured the article, I kept waiting for the "proof," savoring every paragraph. The article first explained the setting of the study: a research and development center for the U.S. Marine Corps which performs software maintenance on "all real-time, tactical computer programs in the Corps." Then it listed the software documentation tools studied:
1. The Data Base Design Document (DBDD), which contains descriptions and pictorial representations of all program data structures
2. ANSI flowcharts (FLOW) as widely used [sic] in DP
3. Hierarchy diagrams (HIER), which show the calling hierarchy of a program similar to an organizational chart
4. Hierarchical-Input-Processing-Output charts (HIPO) as widely used [sic] in DP
5. Computer Program Listings (LIST), which at this installation provided interspersed source and object code, and also provided extensive cross-referencing and set/used information on all program elements.
Also on the same page, prominently displayed next to the headline, was the illustration shown here as Figure 3.
Figure 3. How Various Documentation Types "Rate" with Programmers
I studied the diagram before turning to the continuation, hoping perhaps here was the clue to the nature of the promised "proof." Then I noticed HIPO had a "score" of 0.0 and an asterisk meaning "variation from average scores on starred items are significant beyond 0.90 level." I began to worry.
Why worry? First of all, I have a chronic difficulty with the statistical word "significant." To a statistician, "significant at a 0.90 level" means something like this:
"If we did this same study 100 times, and there really weren't any differences among the variables, we'd get as strong a result about 10 times."
To a reader, it means something further, by omission. The tradition is to cite levels of "significance" of 0.90, 0.95, 0.98, 0.99 and perhaps 0.999. It's also understandable why experimenters cite the strongest level their statisticians will permit. So, if a 0.90 level is cited in the article, you can be sure a 0.95 level wasn't reached in the experiment.
There's nothing wrong, though, with this definition of "significant"—except it has nothing to do with what non-statisticians understand by the other word, "significant," which happens to be spelled with the same letters and pronounced in the same way. This other word means, according to my dictionary: "Having a meaning; meaningful; full of meaning; important; notable."
You see, it could be "important" or "notable" that HIPO would score 0.0 no more than 10 times out of 100 on a similar experiment if there really weren't any differences among the variables. It could be, but there was nothing on page 33 to indicate it was. Page 33, in fact, said nothing about where these numbers came from.
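The statistician's meaning of "significant" can be made concrete with a quick simulation. The sketch below uses made-up numbers (the article reports no raw data): it draws two samples from the same population, so there is truly no difference between them, and counts how often the gap between their means still crosses a two-sided 0.90-level threshold. The answer is about one run in ten, which is exactly what "significant beyond the 0.90 level" promises and no more.

```python
import math
import random

random.seed(42)  # reproducible

def mean_diff(n, sigma):
    """Difference of means of two samples drawn from the SAME population."""
    a = [random.gauss(0.0, sigma) for _ in range(n)]
    b = [random.gauss(0.0, sigma) for _ in range(n)]
    return sum(a) / n - sum(b) / n

n, sigma = 18, 1.0  # eighteen "programmers" per group; the spread is arbitrary
# Two-sided 0.90-level threshold: under the null hypothesis,
# |difference| exceeds 1.645 * sigma * sqrt(2/n) about 10% of the time.
crit = 1.645 * sigma * math.sqrt(2.0 / n)

trials = 10_000
false_alarms = sum(1 for _ in range(trials) if abs(mean_diff(n, sigma)) > crit)
frac = false_alarms / trials
print(frac)  # roughly 0.10 -- "significant" results with no real difference at all
```

In other words, one study in ten of this kind will wave a "significant" flag over nothing whatsoever.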
So I worry about the 0.0. I happen to believe the use of HIPO is not cost-justified in most of the places it's used, but I've never said its value was 0.0. Indeed, anyone who's worked with HIPO, or watched others work with it, would recognize there is some value in the technique. The question I am asked as a consultant, however, is different: "Will the use of HIPO be worth what it will cost?"
To this question, I usually respond, "I don't know, but if you wish to experiment with HIPO, drop one other method first—one putting a comparable cost burden on the programmers." When my clients do as I say, they sometimes find HIPO is worthwhile. And sometimes dropping the other method was even more worthwhile.
In short, the 0.0 was not very believable—until I noticed the word "normalized" in the figure. Now I understood. This figure is what we in the trade call a "Gee Whiz Graph." "Normalizing" the scores means rescaling them so the lowest comes out at zero, which has the effect of maximizing the appearance of difference.
We don't know, and the article doesn't say, what the original scores were. They could have been 3.0 for HIPO and 4.8 for LIST. On the other hand, they could have been 497.1 for HIPO and 498.9 for LIST. It may not matter statistically, but it sure matters to the people trying to understand the significance (importance) of the graph.
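The distortion is easy to demonstrate with the two hypothetical score pairs just mentioned (again, the article never reports the real numbers). After min-max normalizing, both pairs come out as 0.0 versus 1.0, so the bar chart looks identical whether the true gap is a fraction of a percent or a factor of wow:

```python
def normalize(scores):
    """Min-max rescaling: the lowest score becomes 0.0, the highest 1.0."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

# Hypothetical raw scores -- the article never reports the real ones.
close_race = [497.1, 498.9]  # HIPO vs. LIST, a gap under half a percent
wide_race = [3.0, 4.8]       # HIPO vs. LIST, a gap of 60 percent

print(normalize(close_race))  # [0.0, 1.0]
print(normalize(wide_race))   # [0.0, 1.0] -- the "graph" is identical
```

Whatever the raw data were, the normalized figure was guaranteed to show HIPO flat on the floor.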
Needless to say, I was discouraged, but deep devotion to my own opinions kept me going long enough to turn to page 37 for the conclusion of the article. There I discovered these numbers were not the outcome of some experiment with documentation tools, but the "results of a questionnaire" given to eighteen programmers. An opinion poll!
These results, according to the authors, "show that LIST is clearly regarded as the superior tool" (my italics).
Regarded? My dictionary gives these three definitions of regard (among others):
1. To observe closely.
2. To look upon or consider in a particular way.
3. To have great affection or admiration for.
In short, though the headline writer assumed the first meaning of regard, the second or, even more, the third meaning seems the more appropriate choice in this case. It seems to me this study "proves" it's quite likely these eighteen programmers have more affection for LIST than they have for HIPO.
I want to be fair to the authors of the study. An essay may not provide the whole story. I'd guess, though, it's pretty close to the essence. And the essence is this: we're now to choose software tools the same way we choose cigarettes or deodorants.
I can see it now:
"Nine out of ten chief programmers prefer preprocessors."
"I've been a Fortran user ever since I started programming, and I'd rather fight than switch."
"Be a real programmer—don't use virtual machines!"
Come to think of it, haven't we always done it? Yes, but now we have statistics and experimental psychology to support our prejudices. Who says software engineering has no future?
My oldest son, Chris, attended classes at various universities for about ten years. At first he would cut his high school classes and visit the university to attend lectures on subjects he found interesting. Later, he dropped out of high school altogether and just started attending university courses without registering.
After a few years, Chris earned a high school diploma by examination and began his attempt to be a regular college student. Somehow the regularity of this position didn't fit well with Chris. Each semester, he would register for five courses and drop three of them. Sometimes he didn't bother to drop, but flunked by not attending class. Finally, he realized he was wasting a lot of money this way and he had no future in the university as long as he was a regular student.
Chris had always been interested in horticulture. He accepted a position on the university grounds staff allowing him to work outdoors, earn a little money, and even take a course or two at university expense. This new arrangement seemed to suit him. He got a nice suntan, good strong muscles, and almost every day he learned something. As far as I could tell, though, most of his learning came on his job, not from his two courses.
One day Chris began complaining about his boss, who had been on the grounds crew more than forty years and was close to retirement. I asked Chris if his boss was so good a worker they wanted to keep him on past retirement age. "No," Chris said, "he hardly works at all."
"Well, he must be an excellent supervisor."
"No," Chris said, "we do pretty much what we want. He's hardly ever watching us or even to be found when we're working."
"Well, does he have some kind of blackmail information over the university administration, so they're afraid to let him off the job?"
"No, that's not it either," said Chris, "although that's perhaps a little closer to the truth. The fact is, he's the only one who's been around so long that he knows where all the underground pipes are buried. He may not do anything for several months, then one day they're about to dig up some new area of the campus so they'll come to him and ask him to locate the buried pipes. He remembers them all, and this can save the university thousands of dollars in mistakes—like cutting into a water pipe when they're digging a trench. I'm sure he earns his salary many times over just because of the information he has in his head."
As a parent, I was unable to allow this lesson to pass without drawing a moral. I said to Chris in my most fatherly tone, "This is the Way of the World. You're paid more for what you know than for what you do. You could work a thousand times harder and never have the job security your boss has just from knowing where all the pipes are buried."
The situation is no different in computing. Unfortunately, management in many computing organizations doesn't seem to know most of the important documentation is being carried around in people's heads. Managers get very upset at the idea of paying someone whose only function is remembering how things were or why they are the way they are. Management wants to pay for lines of code or some other tangible sign of effort.
The conceptual problem managers have with this type of worker stems from their confusion between documentation and documents. In this respect, there's a very strict parallel between landscaping and programming. In landscaping, the basic documentation is in the ground. If your map says there's no pipe in a certain place and your shovel hits a pipe, you have to believe the shovel. The same is true if the map disagrees with old Fred's memory. Fred is more likely to be right than the map. But the shovel is the final judge.
In programming, the code is the ground. There may be lots of other paper documents lying about, but what they're lying about most of the time is the code. Every good maintenance programmer knows this and, indeed, rarely looks at the documents that are supposed to document the system under maintenance. When a question isn't answered directly out of the maintenance programmer's own head, the first source of information is old George who worked on this system a couple of years ago. Nine times out of ten, George can provide the information you need.
The other one time out of ten, George is likely to be able to answer the question, "Who else might know?" Perhaps he refers you to Sally, and Sally has nine chances out of ten of knowing the answer. Once in a while, George or Sally will refer you to some document, but that document is likely to have been superseded by some other document, which is more up-to-date but doesn't happen to have the information you want.
The previous description was not intended as an editorial comment, simply a description. Programming managers would do very well to study my description and then compare it with the process actually taking place in their own organizations. Management is a difficult business. It's made more difficult by managers who are afraid to start from the base of reality in thinking about what to do in their organization. If you would like to improve the documentation situation where you work, wouldn't it be a good idea to start with a clear picture of what the situation is right now?
The next step in a program of improved documentation is to see what can be done to assist the natural processes of documentation before trying to introduce artificial methods. For instance, since the code is the basic ground upon which all documentation is based, why not take steps to improve the quality of the code as documentation? The first step in such a code-improvement program is to be sure that no code goes in the library without having been read and understood by one or more people. If the code is a document, it stands to reason it can be tested for its documentation power only by being read. Once your organization starts reading code on a regular basis, many other ideas will spring to life about how to improve the code's readability.
While you're improving the code as document, you might also look around to see what can be done to improve the quality of individuals as documents. As a first step, you should notice this: When individuals leave an organization, they lose most of their usefulness, just as documents do. Therefore, any policies reducing employee turnover are likely to improve the quality of the living documentation.
To some extent, you might protect yourself from turnover by capturing some of this living documentation on a recording. Audiotape is fine, but videotape is even better. Why not have the designers on each part of the system take a few minutes in your company's video studio to record the thought process behind the design—the thought process lost to view a few months after coding begins? In a short time, you'll accumulate a nice library of video documents, which at the very least can be used to introduce new people to your projects. They'll not only see the thinking underlying the project, but they'll also be introduced to the personalities who are the living documents they will refer to when necessary.
If you don't have videotape, or even if you do, you might want to create other forms of indexing to guide people with questions to people with answers. For instance, each piece of code might contain a prologue cumulatively listing all the people who ever laid a hand on the code. Another method often used is to have each line of code contain the initials of the last person who worked on it. That way, if the maintenance programmer is having some difficulty with a particular line, there's no trouble locating the person who has the most recent direct knowledge.
Even your regular paper documents should contain an adaptive index to the people who worked on or used them. At the university, there are maps of the grounds dating back seventy years. These maps are useless without the learned commentary of one of the old-timers who remembers an error in this map from when they used it in 1957. And sure enough, they never did correct the error even though they did the right thing underground. Indeed, the more imposing the document scheme, the less likely it is to be updated by the ordinary mortals who have to use it. No groundskeeper is going to place dirty, ugly handwriting on the beautiful drawing made by the architect in 1940.
In the same way, no programmer is going to despoil the documents left by the designers of our holy system. But any programmer who tries to use the documents will remember vividly how long it took to understand a certain paragraph. The programmer will be more than happy to help the poor novice who's now trying to understand the incomprehensible paragraph. If you provide a place on each document for people to initial when they've read and understood it, later readers will have a reference to those people who are most likely to be of assistance.
I could give many more examples of how you could enhance your own living documentation system, but it's better if you know and study your own system and develop your own suggestions. Only then will my suggestions be completely relevant to your own problems. I've gotten whatever ideas I have on the subject from organizations that have done just that—studied their own informal documentation system and looked for ways to help it along.
So let that be a lesson to all of you who say that the university never teaches anything. The university teaches many things. If you recall your own university years—assuming you had such years—and if you're honest with yourself, you'll realize you, too, learned many things at the university. Possibly, you didn't know what you'd learned because you were concentrating on the learning taking place in the formal part of the university—just as you concentrate on the formal part of your documentation schemes. Most of the learning at the university takes place outside the classroom, just as most of the documentation in the programming shop takes place outside the formal system of documents.
One Tuesday, a mouse was exploring the kitchen when he happened to get on top of the ironing board. On the board was an electric iron whose face was so shiny he could see his reflection in it. As the iron was tilted a bit backward, his reflection seemed to him to be another mouse who was a bit shorter than himself. As he was a lonely mouse, he took the size of the reflection to mean this new mouse was a girl.
Shyly, he moved a bit closer to the stranger. With equal shyness, she advanced toward him. Cautiously, he smiled. Just as cautiously, she smiled back. As can be imagined with such a responsive partner, he quickly fell head over heels in love.
They sat for a while, gazing into each other's eyes in loving rapture, until the lady of the house opened the kitchen door. "Run," he commanded his sweetheart, and they scampered away from the iron. Glancing back for a moment, he saw that she was indeed running away, but in the other direction. "She must live over in that neighborhood," he thought.
All the next week he looked for her on the opposite side of the kitchen, but she was nowhere to be found. On Tuesday, however, the ironing board was set up again and the iron was put on top and plugged in. As soon as the lady of the house left the room, he scooted up on the board. Sure enough, there in the iron was his true love. As he ran up to her, he could not conceal his emotions. He saw she could not conceal hers either. He reached out to her, and she reached out to him. They touched paws. They kissed. He was overcome with emotion, though enough in control of his senses to see she was likewise overcome. He kissed her again, and it seemed her kiss was growing warmer.
Again and again he kissed her. Now her growing warmth was unmistakable. Indeed, he was becoming so warm he had to move away from her a bit to cool off. He smiled and told her things about himself. Though she did not answer (he liked women who were good listeners), because of her smiles, he felt she was responding to what he said. At last, he could resist her charms no longer, and he rushed forward to kiss her once again.
By this time, of course, the iron had reached its full heat. "Eeeow!" he cried, jumping back with burned mouth and paws. "Why did you do that?" But before she could reply, the lady of the house came in, so both mice had to run away.
All week he brooded about what he had done to offend his sweetheart. Perhaps he had been too forward. Or, perhaps, not forward enough. On the other hand, he had spent a lot of time talking about himself. Perhaps she found his ego offensive. "Next time," he resolved, "I will let her tell me about herself. I will beg for forgiveness."
Next Tuesday, the ironing board came out again. He rushed to see his love, and his heart was pounding for fear she would not come. But she was there, and as the iron was not plugged in, she received him in a friendly, though somewhat cool, manner. They passed a pleasant hour, kissing and holding paws, and not a word was said about their previous trouble. Before he had a chance to bring it up, the kitchen door opened, and they had to part for another week.
For the mouse, it proved another week of brooding. Delighted as he was she had taken him back, he could still recall the coolness which she had not shown before. He finally decided he must be very careful on their next meeting.
That meeting was delayed for more than an hour, because the lady of the house stayed in the kitchen while the iron was warming and did not leave until the ironing was finished and the iron unplugged. No sooner was she out of the room than the mouse leapt up on the ironing board and raced into the outstretched arms of his darling. "Eeeow!" he screamed as he ran right into the searing iron. "Why did you do that?"
He questioned her, pleaded with her, even confessed to her all of his shortcomings. Despite his pleadings, she would not tell him what he had done wrong. At last, however, he began to detect a change in her attitude, even though she had not said anything. "Perhaps," he wondered, "she has forgiven me, seeing that I have been punished enough." He moved forward to her and, sure enough, she returned his kiss and embrace, with all the warmth of their second meeting.
And so it went from Tuesday to Tuesday, sometimes hot and sometimes cold. The poor mouse was soon covered with scars from being branded by the iron. Worse than that, he became preoccupied trying to understand his girlfriend's behavior. He began missing meals and losing weight. Finally, one Tuesday, right after seeing her, he was so distracted and hungry he blundered into a mousetrap, which ended the misery of his ill-starred love.
Moral: If you are foolish enough to suppose the iron turns hot and cold for you, you are foolish enough to get burned.
In other words: Communication always involves two people, not one plus iron.