Limit line length

Want a codebase that’s easier to read, maintain, and extend? Enforce a line length limit.

It’s simple, it’s easy, and it works for every language - even ones that haven’t been invented yet. And enforcement doesn’t require any special tools. The venerable awk can do it in one line.

length >= 80 { print FILENAME ":" FNR ":" length; code = 1 } END { exit code }

Here’s what limiting line length can do for you.

Flatter code

Excessive nesting is one of the fastest ways to make a codebase inscrutable. Programmers know this, but in the code-writing process, they end up piling on layers of context almost effortlessly. And in the moment, it really is effortless, because all of that context is still fresh in their mind.

But context-heavy code is difficult to return to. Not only that, it’s difficult to wrest from the context that spawned it, should it find itself applicable elsewhere.

Line length limits provide an organic deterrent to nesting: the more deeply nested the program currently is, the less space the developer has to write the next statement. Within just a few levels of context, developers find themselves encouraged to break out chunks of code into separate functions.
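
As a rough sketch of that pressure at work, here is the same loop written twice. The function names and the shape of the data are made up for illustration; the point is only that the nested version burns four columns per level, while the flat version uses a guard clause and a small named helper.

```python
# Hypothetical example: names and data shapes are invented for illustration.

def active_emails_nested(teams):
    """Deeply nested version: each level eats four more columns."""
    emails = []
    for team in teams:
        if team["enabled"]:
            for user in team["users"]:
                if user["active"]:
                    if user.get("email"):
                        emails.append(user["email"].lower())
    return emails


def wants_mail(user):
    """True for an active user with an email address on file."""
    return user["active"] and bool(user.get("email"))


def active_emails_flat(teams):
    """Flatter version: a guard clause plus a named helper."""
    emails = []
    for team in teams:
        if not team["enabled"]:
            continue
        for user in team["users"]:
            if wants_mail(user):
                emails.append(user["email"].lower())
    return emails
```

Both functions return the same list. The flat one simply leaves more room on each line, and wants_mail can be reused or tested on its own.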

Named intermediates

There is a temptation among developers to get as much work done in a single line of code as possible. It’s fun to write, fun to read, and best of all, a nightmare to debug.

Sadly, the joys of recreational obfuscation do not make up for the damage it does to the readability of a codebase. Limiting line lengths naturally curtails this behavior, forcing developers to name intermediate values when writing complex logic or long chains of operations. These named intermediates serve as complexity checkpoints, making it easier to understand what the code is doing.

# A long line with no intermediate values.
ip = response['Reservations'][0]['Instances'][0]['NetworkInterfaces'][0]['Association']['PublicIp'] # nolimit

# The same operation with named intermediate values.
instance = response['Reservations'][0]['Instances'][0]
interface = instance['NetworkInterfaces'][0]
ip = interface['Association']['PublicIp']

A length limit also catches overly-long if statements, encouraging decomposition into discrete, labeled chunks.

// A long line with a complex condition.
if admins[user] || post.Author == user || slices.Contains(user.Permissions, "managePosts") { // nolimit
    post.Delete()
}

// The same operation with part of the condition factored out.
if admins[user] || user.postManager(post) {
    post.Delete()
}

func (u User) postManager(p post) bool {
    return p.Author == u || slices.Contains(u.Permissions, "managePosts")
}

Increased specificity

Variables, classes, functions, and packages shouldn’t have long names. Most things can be reasonably described with only one or two words. Using a phrase in place of a name is a symptom of imprecision in word choice and should be discouraged.

currentUserName is clumsier than user: it takes up more space and takes longer to read and write, yet offers no additional information. Even single-character names, like u, can be acceptable if the variable is short-lived enough.

The relationship of reasonable variable names to line length is obvious: long names lead to long lines. A limit prompts developers to reconsider badly chosen names.
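
To make that concrete, here is a small sketch with invented names and data. The two functions do exactly the same thing; only the names differ, and only the first needs an exemption from the limit.

```python
# Hypothetical example: both functions behave identically.

def greet_verbose(current_user_record):
    current_user_display_name = current_user_record["name"].strip().title()
    return "Hello, " + current_user_display_name + "! You have " + str(current_user_record["unread"]) + " unread." # nolimit


def greet(user):
    name = user["name"].strip().title()
    return "Hello, " + name + "! You have " + str(user["unread"]) + " unread."
```

Inside a function this short, user and name are already unambiguous, so the longer names buy nothing except an overlong line.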

Greater accessibility

Short lines are versatile. They are easy to diff, merge, and compare. They remain readable in any environment: IDEs, browsers, terminals, patches, tickets, Slack messages, issue comments, and error messages.

Often, developers only consider what their code looks like in the environment in which it was written. But code is read more often than it is written, and their editor of choice is only one of many places their code might be read from. When writing code, the author should keep in mind that not every environment, nor every person, has the same screen space as they do.

Of course, it’s possible to scroll, enable soft wrapping, or copy-paste code into an editor to mitigate this issue. Any code quality problem can be worked around with enough tooling. But tooling is expensive, and the cost to use tools is paid every time the code is read. By comparison, keeping lines short is a cost that only needs to be paid once, when the code is written.

Suggested limits

An 80-character line limit is ideal for most circumstances. Occasionally a project may need to raise it to 100 or 120 characters, and a relaxed limit is better than giving up on limits altogether. Still, a surprising amount of code can be clearly expressed in 80 characters.

About tabs

Some languages permit, or worse, require, tabs for indentation. To agree on a maximum line length, all developers must also agree on the width of a tab.
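
A quick sketch shows why. The helper name visual_length is my own, and it makes a simplifying assumption: every tab counts as a fixed number of columns, which only matches what an editor draws when tabs are used for leading indentation.

```python
def visual_length(line, tab_width):
    """Length of a line when every tab is drawn as tab_width columns.

    Simplification: real editors align tabs to tab stops, which only
    matches this count for tabs used as leading indentation.
    """
    return len(line.replace("\t", " " * tab_width))


# Two levels of tab indentation followed by 70 characters of code.
line = "\t\t" + "x" * 70

print(visual_length(line, 4))  # 78
print(visual_length(line, 8))  # 86
```

The same line fits under an 80-column limit for a developer whose tabs are four wide and breaks it for one whose tabs are eight wide.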

As with the limit itself, consistency matters more than the exact value. EditorConfig is the standard here, supported by most IDEs and even some code hosts, such as GitHub.

There seems to be a growing consensus that a tab should be four spaces wide. Many IDEs come pre-configured with a tab width of four. Even some in-browser coding applications, such as the Go Tour, use this width.

Personally, I find myself agreeing with this consensus. Eight is the historical value, but even given a generous line limit, I find that it makes code cramped and uncomfortable to write at only two levels of indentation, which is not a reasonable nesting limit in most languages.

Limit limitations

You would be forgiven for wondering if a limit like this won’t just turn every line of code into a code golfing contest. Of course, developers can work around this limitation in unwise ways and create terrible programs. But this is true of every linter; those who think otherwise simply lack imagination.

Line limits go hand-in-hand with code formatters, which make egregious workarounds much more difficult.

Finally, some lines will absolutely need to be long, like URLs. If you run up against this problem, you’ll need a more complex linter than the simple awk program described earlier. Since you’ve gotten to the end of this essay, here’s a more flexible example, the very code used to lint this site:

#!/usr/bin/env awk -f
BEGIN {
    cols = 80
    tabs = 4
    skip = "(#|//) nolimit$"
}
$0 !~ skip {
    gsub(/\t/, sprintf("%"tabs"s", ""))
    if (length < cols) next
    print FILENAME ":" FNR ":" length "\n    " $0
    code = 1
}
END { exit code }

And here’s an example find command searching for Markdown files and running them against our awk linter:

find . -type f -name \*.md -exec linelimit.awk {} \;