jointhefreeworld.org

Tips for clean code with Go

October 2, 2020

# Why Write Clean Code?

This document is a reference for the Go community that aims to help developers write cleaner code. Whether you’re working on a personal project or as part of a larger team, writing clean code is an important skill to have. Establishing good paradigms and consistent, accessible standards for writing clean code can keep developers from wasting hours trying to understand their own (or others’) work.

We don’t read code, we decode it – Peter Seibel

As developers, we’re sometimes tempted to write code in a way that’s convenient for the time being without regard for best practices; this makes code reviews and testing more difficult. In a sense, we’re encoding—and, in doing so, making it more difficult for others to decode our work. But we want our code to be usable, readable, and maintainable. And that requires coding the right way, not the easy way.

This document begins with a simple and short introduction to the fundamentals of writing clean code. Later, we’ll discuss concrete refactoring examples specific to Go.

# About gofmt

I’d like to take a few sentences to clarify my stance on gofmt, because there are plenty of things I disagree with when it comes to this tool. Still, it gives us a common standard for writing Go code, and that’s a great thing. As a developer myself, I can certainly appreciate that Go programmers may feel somewhat restricted by gofmt, especially if they disagree with some of its rules. But in my opinion, homogeneous code is more important than having complete expressive freedom.

# Introduction to Clean Code

Clean code is the pragmatic concept of promoting readable and maintainable software. Clean code establishes trust in the codebase and helps minimize the chances of careless bugs being introduced. It also helps developers maintain their agility, which typically plummets as the codebase expands due to the increased risk of introducing bugs.

# Test-Driven Development

Test-driven development is the practice of testing your code frequently throughout short development cycles or sprints. It ultimately contributes to code cleanliness by inviting developers to question the functionality and purpose of their code. To make testing easier, developers are encouraged to write short functions that only do one thing. For example, it’s arguably much easier to test (and understand) a function that’s only 4 lines long than one that’s 40.

Test-driven development consists of the following cycle:

  1. Write (or execute) a test
  2. If the test fails, make it pass
  3. Refactor your code accordingly
  4. Repeat

Testing and refactoring are intertwined in this process. As you refactor your code to make it more understandable or maintainable, you need to test your changes thoroughly to ensure that you haven’t altered the behavior of your functions. This can be incredibly useful as the codebase grows.
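
As a rough illustration of one pass through this cycle, the sketch below pairs a tiny function with a table-driven test. The names Sum and TestSum are hypothetical and invented for this example. In a real project the test would normally live in its own file (for instance sum_test.go) and be run with go test; it is shown alongside the function here only to keep the example in one place.

```go
package main

import (
	"fmt"
	"testing"
)

// Sum adds up all numbers. It is a hypothetical example function, kept
// short so that its behavior is easy to pin down in a test.
func Sum(numbers []int) int {
	total := 0
	for _, number := range numbers {
		total += number
	}
	return total
}

// TestSum would normally live in a file such as sum_test.go and be run
// with `go test`. In the cycle above it is written first, fails, and then
// drives the implementation of Sum.
func TestSum(t *testing.T) {
	cases := []struct {
		name    string
		numbers []int
		want    int
	}{
		{"empty slice", nil, 0},
		{"single value", []int{4}, 4},
		{"several values", []int{1, 2, 3}, 6},
	}
	for _, c := range cases {
		if got := Sum(c.numbers); got != c.want {
			t.Errorf("%s: Sum(%v) = %d, want %d", c.name, c.numbers, got, c.want)
		}
	}
}

func main() {
	fmt.Println(Sum([]int{1, 2, 3})) // prints 6
}
```

Because Sum does only one thing, each row of the table reads as a statement about its behavior, and a later refactoring can be checked by rerunning the same table.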

# Naming Conventions

# Comments

I’d like to first address the topic of commenting code, which is an essential practice but tends to be misapplied. Unnecessary comments can indicate problems with the underlying code, such as the use of poor naming conventions. However, whether or not a particular comment is “necessary” is somewhat subjective and depends on how legibly the code was written. For example, the logic of well-written code may still be so complex that it requires a comment to clarify what is going on. In that case, one might argue that the comment is helpful and therefore necessary.

In Go, linters such as golint expect every exported variable and function to carry a doc comment (gofmt itself only formats code and does not check comments). I think this is absolutely fine, as it gives us consistent rules for documenting our code. However, I always want to distinguish between comments that enable auto-generated documentation and all other comments. Annotation comments, for documentation, should be written like documentation: they should be at a high level of abstraction and concern the logical implementation of the code as little as possible.
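
To make that distinction concrete, here is a minimal sketch that places documentation comments next to a single implementation comment. DefaultTimeout and RetryDelay are hypothetical names made up for this illustration; they do not belong to any real library.

```go
package main

import (
	"fmt"
	"time"
)

// DefaultTimeout is the longest a single request may take before it is
// abandoned.
const DefaultTimeout = 30 * time.Second

// RetryDelay returns how long to wait before the given retry attempt.
// The delay doubles with every attempt and never exceeds DefaultTimeout.
func RetryDelay(attempt int) time.Duration {
	delay := time.Second
	for i := 0; i < attempt; i++ {
		delay *= 2
		if delay >= DefaultTimeout {
			// Stop early so repeated doubling cannot overflow.
			return DefaultTimeout
		}
	}
	return delay
}

func main() {
	fmt.Println(RetryDelay(0), RetryDelay(3), RetryDelay(10)) // prints 1s 8s 30s
}
```

The two doc comments describe what a caller can rely on without mentioning the loop. The one comment inside the function explains why the early return exists, which the code alone does not make obvious.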

I say this because there are other ways to explain code and ensure that it’s being written comprehensibly and expressively. If the code is neither of those, some people find it acceptable to introduce a comment explaining the convoluted logic. Unfortunately, that doesn’t really help. For one, most people simply won’t read comments, as they tend to be very intrusive to the experience of reviewing code. Additionally, as you can imagine, a developer won’t be too happy if they’re forced to review unclear code that’s been slathered with comments. The less that people have to read to understand what your code is doing, the better off they’ll be.

Let’s take a step back and look at some concrete examples. Here’s how you shouldn’t comment your code:

// iterate over the range 0 to 9
// and invoke the doSomething function
// for each iteration
for i := 0; i < 10; i++ {
  doSomething(i)
}

This is what I like to call a tutorial comment; it’s fairly common in tutorials, which often explain the low-level functionality of a language (or programming in general). While these comments may be helpful for beginners, they’re absolutely useless in production code. Hopefully, we aren’t collaborating with programmers who don’t understand something as simple as a looping construct by the time they’ve begun working on a development team. As programmers, we shouldn’t have to read the comment to understand what’s going on—we know that we’re iterating over the range 0 to 9 because we can simply read the code. Hence the proverb:

Document why, not how. – Venkat Subramaniam

Following this logic, we can now change our comment to explain why we are iterating over the range 0 to 9:

// instantiate 10 threads to handle upcoming work load
for i := 0; i < 10; i++ {
  doSomething(i)
}

Now we understand why we have a loop and can tell what we’re doing by simply reading the code… Sort of.

This still isn’t what I’d consider clean code. The comment is worrying because it probably should not be necessary to express such an explanation in prose, assuming the code is well written (which it isn’t). Technically, we’re still saying what we’re doing, not why we’re doing it. We can easily express this “what” directly in our code by using more meaningful names:

for workerID := 0; workerID < 10; workerID++ {
  instantiateThread(workerID)
}

With just a few changes to our variable and function names, we’ve managed to explain what we’re doing directly in our code. This is much clearer for the reader because they won’t have to read the comment and then map the prose to the code. Instead, they can simply read the code to understand what it’s doing.
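
For readers who want to run the loop, the following is one possible sketch of it. The article only names instantiateThread, so its body here is an assumption: it starts a goroutine that reports its ID on a channel. The workerCount constant and the runWorkers function are likewise hypothetical additions.

```go
package main

import (
	"fmt"
	"sync"
)

// workerCount is a hypothetical constant; the loop in the article
// hard-codes the value 10.
const workerCount = 10

// instantiateThread starts one goroutine for the given worker. The body is
// an assumption: each worker simply reports its ID on the results channel.
func instantiateThread(workerID int, wg *sync.WaitGroup, results chan<- int) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		results <- workerID
	}()
}

// runWorkers starts all workers, waits for them to finish, and returns how
// many of them reported back.
func runWorkers() int {
	var wg sync.WaitGroup
	// Buffered so that no worker blocks while sending its result.
	results := make(chan int, workerCount)

	for workerID := 0; workerID < workerCount; workerID++ {
		instantiateThread(workerID, &wg, results)
	}

	wg.Wait()
	close(results)

	finished := 0
	for range results {
		finished++
	}
	return finished
}

func main() {
	fmt.Println("workers finished:", runWorkers()) // prints workers finished: 10
}
```

Even in this longer form, the loop itself needs no comment: workerID, workerCount, and instantiateThread already say what is happening.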

Of course, this was a relatively trivial example. Writing clear and expressive code is unfortunately not always so easy; it can become increasingly difficult as the codebase itself grows in complexity. The more you practice writing comments in this mindset and avoid explaining what you’re doing, the cleaner your code will become.

# Function Naming

Let’s now move on to function naming conventions. The general rule here is really simple: the higher a function’s level of abstraction, the shorter and more general its name, and the deeper we go, the more specific the names become. In other words, we want to start with a very broad and short function name, such as Run or Parse, that describes the general functionality. Let’s imagine that we are creating a configuration parser. Following this naming convention, our top level of abstraction might look something like the following:

func main() {
    configpath := flag.String("config-path", "", "configuration file path")
    flag.Parse()

    config, err := configuration.Parse(*configpath)
    ...
}

We’ll focus on the naming of the Parse function. Despite this function’s very short and general name, it’s actually quite clear what it attempts to achieve.

When we go one layer deeper, our function naming will become slightly more specific:

func Parse(filepath string) (Config, error) {
    switch fileExtension(filepath) {
    case "json":
        return parseJSON(filepath)
    case "yaml":
        return parseYAML(filepath)
    case "toml":
        return parseTOML(filepath)
    default:
        return Config{}, ErrUnknownFileExtension
    }
}

Here, we’ve clearly distinguished the nested function calls from their parent without being overly specific. This allows each nested function call to make sense on its own as well as within the context of the parent. On the other hand, if we had named the parseJSON function json instead, it couldn’t possibly stand on its own. The functionality would become lost in the name, and we would no longer be able to tell whether this function is parsing, creating, or marshalling JSON.
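
To see the two layers working together, here is a small runnable sketch restricted to the JSON case. The fields of Config, the error message, and the body of parseJSON are assumptions made for illustration. The YAML and TOML branches are left out because they would require third-party packages, and the fileExtension helper is copied from the article.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"strings"
)

// Config is a hypothetical configuration type; the article does not show
// its fields.
type Config struct {
	Name string `json:"name"`
	Port int    `json:"port"`
}

// ErrUnknownFileExtension is returned when no parser matches the file.
var ErrUnknownFileExtension = errors.New("unknown file extension")

// fileExtension returns the text after the last dot in path.
func fileExtension(path string) string {
	segments := strings.Split(path, ".")
	return segments[len(segments)-1]
}

// parseJSON reads the file at path and decodes it as a JSON Config.
func parseJSON(path string) (Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return Config{}, fmt.Errorf("read config: %w", err)
	}
	var config Config
	if err := json.Unmarshal(data, &config); err != nil {
		return Config{}, fmt.Errorf("decode config: %w", err)
	}
	return config, nil
}

// Parse picks a parser based on the file extension of path.
func Parse(path string) (Config, error) {
	switch fileExtension(path) {
	case "json":
		return parseJSON(path)
	default:
		return Config{}, ErrUnknownFileExtension
	}
}

func main() {
	_, err := Parse("settings.ini")
	fmt.Println(err) // prints unknown file extension
}
```

Reading from the top, Parse tells us what happens, and parseJSON tells us how for one format. Neither name needs the other to make sense.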

Notice that fileExtension is actually a little more specific. However, this is because its functionality is in fact quite specific in nature:

func fileExtension(filepath string) string {
    segments := strings.Split(filepath, ".")
    return segments[len(segments)-1]
}

This kind of logical progression in our function names—from a high level of abstraction to a lower, more specific one—makes the code easier to follow and read. Consider the alternative: If our highest level of abstraction is too specific, then we’ll end up with a name that attempts to cover all bases, like DetermineFileExtensionAndParseConfigurationFile. This is horrendously difficult to read; we are trying to be too specific too soon and end up confusing the reader, despite trying to be clear!
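
One caveat about the fileExtension helper shown earlier: because it splits on every dot, a path without an extension yields the whole path, and a dot in a directory name is mistaken for the start of an extension. A possible alternative, sketched below, relies on filepath.Ext from the standard library, which only looks at the final path element. The parameter is renamed to path here so that it does not shadow the imported package.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// fileExtension returns the extension of path without the leading dot,
// or an empty string when the final path element has no extension.
func fileExtension(path string) string {
	return strings.TrimPrefix(filepath.Ext(path), ".")
}

func main() {
	fmt.Println(fileExtension("config.json"))         // prints json
	fmt.Println(fileExtension("conf.d/config") == "") // prints true
}
```

The name and signature stay the same, so callers such as Parse would not need to change. Only the lowest level of abstraction is touched.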

# Cleaning Functions

Now that we know some best practices for naming our variables and functions, as well as clarifying our code with comments, let’s dive into some specifics of how we can refactor functions to make them cleaner.