The Elements of Style

William Strunk Jr. would have written beautiful code.

The Elements of Style, by Strunk & White, is a style guide for English prose, widely used in writing classes in the US but less known elsewhere. I was introduced to it many years ago when I was attempting to become a biologist: my PhD supervisor, in despair at the inscrutability of typical scientific writing, would implore all his students to read it.

Reading it again recently, I was struck by how applicable it is to programming. Sure, writing clearly is a useful skill for programmers, since all of us find ourselves doing technical writing from time to time. But what struck me most was how much of what makes good prose also makes good code.

Caution

Notwithstanding the qualities of the book, a word of warning is needed. Some of the book’s prescriptions, especially the most specific, are idiosyncratic (or, depending on whom you believe, wrong). They should be read, not as laws of grammar, but as opinionated style advice. They do not dispense with the need to apply one’s own judgement.

Clarity and concision

Every guide to good style preaches clarity and concision. Ernest Gowers’s The Complete Plain Words springs to mind as an example in British English. But none puts its own advice into practice so completely as The Elements of Style.

Like The Pragmatic Programmer’s “Don’t repeat yourself”, the book’s maxim is just three words long: “Omit needless words.”

Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. This requires not that the writer make all his sentences short, or that he avoid all detail and treat his subjects only in outline, but that every word tell.

One could add, “a software program no unnecessary lines of code.”

Further parallels

Several other, more specific, items of advice have obvious parallels in programming style:

Abbreviations and acronyms

Do not take shortcuts at the cost of clarity. ...Write things out. Not everyone knows that MADD means Mothers Against Drunk Driving, and even if everyone did, there are babies being born every minute who will someday encounter the name for the first time. They deserve to see the words, not simply the initials... Many shortcuts are self-defeating: they waste the reader's time instead of conserving it (p80, my emphasis).

Compare with Steve McConnell’s advice from Code Complete:

The most important consideration in naming a variable is that the name fully and accurately describe the entity the variable represents. An effective technique for coming up with a good name is to state in words what the variable represents. Often that statement itself is the best variable name. It's easy to read because it doesn't contain cryptic abbreviations, and it's unambiguous. (2nd edition, p260, my emphasis)
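
To make the comparison concrete, here is a minimal sketch of the same calculation written twice. The function and parameter names are hypothetical, invented purely for illustration; neither book contains this example.

```python
# Hypothetical example: one calculation, named two ways.

def calc(a, n, r):
    # Abbreviated names: the reader has to guess what a, n and r stand for.
    return a * (1 + r) ** n


def balance_after_years(initial_deposit, years, annual_interest_rate):
    # Each name states in words what the value represents.
    return initial_deposit * (1 + annual_interest_rate) ** years
```

Both functions compute the same result; only the second can be read without consulting its author.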

Negatives

Strunk & White:

Put statements in positive form. Avoid ... the weakness inherent in the word not... [T]he reader is dissatisfied with being told only what is not; the reader wishes to be told what is... [I]t is better to express even a negative in positive form. [For example:] “forgot” [is better than] “did not remember.” (p20-21)

Programmers soon learn the risks of accumulating negatives in boolean expressions. Expressing negatives in positive form simplifies such situations, as recommended here:

Double negations (or worse) should be avoided. To help avoid double negations, boolean methods should be given positive names such as legalMove or gameOver, not negative ones such as illegalMove or gameNotOver.
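
A small sketch of the difference, assuming a hypothetical Game class (the class and the status functions are my own invention; only the names game_over and game_not_over echo the quotation above):

```python
class Game:
    """Hypothetical board game, used only to illustrate naming."""

    def __init__(self, moves_remaining):
        self.moves_remaining = moves_remaining

    def game_not_over(self):
        # Negative name: callers end up writing "not game.game_not_over()".
        return self.moves_remaining > 0

    def game_over(self):
        # Positive name: the condition reads the way one would say it.
        return self.moves_remaining <= 0


def status_with_negative_name(game):
    if not game.game_not_over():  # double negation
        return "finished"
    return "in progress"


def status_with_positive_name(game):
    if game.game_over():
        return "finished"
    return "in progress"
```

The two status functions behave identically, but the first makes the reader cancel out two negatives before the meaning emerges.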

Revise, rewrite, refactor

Two suggestions that call to mind refactoring:

Revise and rewrite. (p72)
Clarity, clarity, clarity. When you become hopelessly mired in a sentence, it is best to start fresh... Usually what is wrong is that the construction has become too involved at some point; the sentence needs to be broken apart and replaced by two or more shorter sentences. (p79)

Exactly the same could be said about writing method bodies. Cf Martin Fowler’s Refactoring, which is all about revising and rewriting code, why it should be done, and how to go about it.

Here are some more of Strunk & White’s recommendations that could equally well be taken as advice on composing and decomposing methods:

[R]emember that paragraphing calls for a good eye as well as a logical mind. Enormous blocks of print look formidable to readers, who are often reluctant to tackle them. Therefore, breaking long paragraphs in two, even if it is not necessary to do so for sense, meaning, or logical development, is often a visual help... Moderation and a sense of order should be the main considerations in paragraphing.

Keep related words together... The writer must... bring together the words and groups of words that are related in thought and keep apart those that are not so related. (p28)

Express coordinate ideas in similar form. This principle, that of parallel construction, requires that expressions similar in content and function be outwardly similar. The likeness of form enables the reader to recognize more readily the likeness of content and function. (p26)

Paragraph composition uses the same skills as breaking up long methods into understandable chunks (cf the Long Method smell in Refactoring, p76), although methods also benefit from a name, as if one were to place a subheading on every paragraph.

Cohesion

Keeping related words together (above) is closely related to the idea of cohesion in routines (Code Complete, p168), and to the “Single Level of Abstraction Principle” (SLAP). The first says that a routine should do one thing and one thing only. The second states that the lines of code in a method should all be expressed at the same level of abstraction: you shouldn’t mix, for example, lines expressing a business rule with lines having a purely technical significance like writing to a file. Rather you should refactor the lower-level code into its own method and give the method a name that is at the appropriate level of abstraction for the code from which you are calling it.
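
As a rough sketch of the second principle, assume a hypothetical order-processing routine with an invented free-shipping rule. The first version mixes the business rule with the mechanics of writing a file; the second moves each concern into a method named at the caller's level of abstraction.

```python
import json

FREE_SHIPPING_THRESHOLD = 50  # hypothetical business rule, for illustration


def process_order_mixed(order, path):
    # Business rule and file handling interleaved in one method.
    order["free_shipping"] = order["total"] >= FREE_SHIPPING_THRESHOLD
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(json.dumps(order, sort_keys=True))
    return order


def process_order(order, path):
    # Every line here is expressed at the level of the business process.
    apply_shipping_rule(order)
    save_order(order, path)
    return order


def apply_shipping_rule(order):
    order["free_shipping"] = order["total"] >= FREE_SHIPPING_THRESHOLD


def save_order(order, path):
    # The technical detail now sits behind a name at the caller's level.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(json.dumps(order, sort_keys=True))
```

Nothing about the behaviour changes; process_order simply reads as a summary of what happens, with the details one step away.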

“Express coordinate ideas in similar form” crops up in Kent Beck’s Implementation Patterns (p15) as “symmetry”. (If you’ll excuse an off-topic hat-tip, it’s also reminiscent of Christopher Alexander’s principle of Alternating Repetition from The Nature of Order; Alexander’s work concerns architecture but has much influenced computer science.)
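
Here is a minimal sketch of what symmetry might look like in code. The Counter class and its method names are hypothetical, chosen by me for illustration. In the first method a bare statement sits between two method calls; in the second, all three steps take the same outward form.

```python
class Counter:
    """Hypothetical class that reads items, counts them and writes them out."""

    def __init__(self, source):
        self.source = list(source)
        self.count = 0
        self.written = []

    def process_asymmetric(self):
        item = self.read_input()
        self.count += 1  # a bare statement between two method calls
        self.write_output(item)

    def process_symmetric(self):
        item = self.read_input()
        self.increment_count()  # same form as its neighbours
        self.write_output(item)

    def read_input(self):
        return self.source.pop(0)

    def increment_count(self):
        self.count += 1

    def write_output(self, item):
        self.written.append(item)
```

The two methods do the same work. The symmetric one lets the reader see at a glance that its three steps are coordinate ideas.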

The underlying principle

Of far less obvious relevance to programming are the chapters dealing with such matters as the placement of commas, or the difference of meaning between “nauseous” and “nauseating.” And yet, and yet… What guides every single one of these recommendations – even the most whimsical – is an underlying principle: to bring the form of what you write as close as possible to its meaning. In programming, this principle is not merely relevant; it is critical. It is the foundation of comprehensibility. For this reason, I think that the most valuable aspect of the book is not its variously-reliable prescriptions on points of grammar, nor its excellent style advice, nor even the model of crisp writing that it provides. Its most valuable lesson is the adoption of clarity, precision and brevity as ideals to aim for. Infusing oneself with this spirit can do only good to one’s code, and it is for that reason that I commend the book to you, my fellow programmers.