Many-Step Sequences in Go

Posted Mar 24, 2026

By Chris Lesiw

Build scripts are generally lots of linear steps, executed in sequence, which fail if any individual step fails. This is straightforward to express in a shell scripting language: begin with set -e (or the ever-popular “Bash strict mode”), write out your sequence of steps, and execute.

Recently I’ve been working on translating some venerable Bash scripts into Go with my command buffers. Moving from Bash to Go provides some immediate benefits, like type safety, modularity, and a more expressive language with fewer gotchas. But, on a first pass, the code tends to end up in a single mega-function that does everything, start to finish.

This is hard to reason about and hard to test. To verify the behavior of a single step, every preceding step has to succeed first, which means faking lots of behavior just to get to the part of the program we’re actually interested in. Tests like these are fragile and cumbersome. And when an early step breaks, it can cause downstream effects, breaking every test that relies on its behavior as a prerequisite. This defeats one of the useful properties of well-written unit tests: making it trivial to pinpoint which part of the program has the bug.

One way to address this is to break the function apart. Each operation becomes its own function, and an orchestrator calls them in sequence.

  
func deploy(ctx context.Context, d *deployState) error {
    switch d.os {
    case "linux":
        if err := installLinux(ctx, d); err != nil {
            return fmt.Errorf("linux install failed: %w", err)
        }
    case "darwin":
        if err := installDarwin(ctx, d); err != nil {
            return fmt.Errorf("macOS install failed: %w", err)
        }
    default:
        return fmt.Errorf("unsupported OS: %s", d.os)
    }
    if err := configure(ctx, d); err != nil {
        return fmt.Errorf("configure failed: %w", err)
    }
    if err := start(ctx, d); err != nil {
        return fmt.Errorf("start failed: %w", err)
    }
    return nil
}

Individual steps can now be tested independently. The orchestrating function, however, can still only be tested by running the entire sequence, even though most of it no longer does any interesting work. The branching decisions that shape the whole workflow still have to live there, too. And the error handling is noisy and verbose, which is often a symptom of a design problem.

Steps as functions

Rob Pike’s 2011 talk on lexical scanning in Go offers an alternative approach that might be applicable here. In it, he describes a state machine for lexing, where each state is a function that returns the next function to run.

  
type stateFn func(*lexer) stateFn

Each function does its work and returns the next state, or returns nil to stop. The driver is a for loop.

  
for state != nil {
    state = state(l)
}

The transitions live in the functions themselves. The driver doesn’t need to know about branching. The repetitive error checking disappears and error handling is represented as a new state.
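As a minimal runnable sketch of that pattern, the state functions below classify whitespace-separated fields and hand control back to a dispatcher. The tally type and its states are hypothetical stand-ins for the talk's lexer, not code taken from it.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tally is a hypothetical stand-in for the lexer: the input that is left,
// plus whatever the states have counted so far.
type tally struct {
	fields  []string
	words   int
	numbers int
}

// stateFn does some work and returns the next state, or nil to stop.
type stateFn func(*tally) stateFn

// dispatch picks the state for the next field. The branching lives here,
// not in the driver.
func dispatch(t *tally) stateFn {
	switch {
	case len(t.fields) == 0:
		return nil
	case unicode.IsDigit(rune(t.fields[0][0])):
		return number
	default:
		return word
	}
}

// word counts one non-numeric field and returns to the dispatcher.
func word(t *tally) stateFn {
	t.words++
	t.fields = t.fields[1:]
	return dispatch
}

// number counts one field that starts with a digit.
func number(t *tally) stateFn {
	t.numbers++
	t.fields = t.fields[1:]
	return dispatch
}

// run is the driver: a loop that knows nothing about the transitions.
func run(input string) *tally {
	t := &tally{fields: strings.Fields(input)}
	for state := stateFn(dispatch); state != nil; {
		state = state(t)
	}
	return t
}

func main() {
	t := run("deploy 3 hosts in 2 regions")
	fmt.Println(t.words, t.numbers) // prints "4 2"
}
```

Adding a new kind of field means adding a state and one case in dispatch; the loop in run never changes.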

This works well for lexing, where the state is a single, local value, and it makes the most sense to test at the boundary of the lexer itself. But for longer-running operations that involve network calls or external processes, context awareness and per-step testability would be useful additions.

There’s also something about this design that’s always left an itch in my mind. One code evolution I come across again and again: once enough top-level functions take the same parameter, that tends to signal they are conceptually related enough to promote the parameter to a method receiver, giving l.state() rather than state(l).

Types as namespaces

If the type that holds a sequence’s state is the natural home for the methods that operate on it, then method values are a convenient way to reference a context-aware step without giving the driver any special knowledge of the state type.

In Go, m := l.state is valid. The result is a method value: a closure that captures l and can be called as m() instead of l.state().
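A small sketch of a method value in isolation may help here. The lexer type and its advance method are hypothetical, chosen only to show that the receiver is bound once and then travels with the function value.

```go
package main

import "fmt"

// lexer is a hypothetical state holder with a single method.
type lexer struct{ pos int }

// advance moves the position forward by one and reports it.
func (l *lexer) advance() int {
	l.pos++
	return l.pos
}

// bindAndCall binds the receiver once, then calls without naming it again.
func bindAndCall() int {
	l := &lexer{}
	m := l.advance // method value of type func() int; l is captured here
	m()
	m()
	return l.pos
}

func main() {
	fmt.Println(bindAndCall()) // prints "2"
}
```

Because m has the plain type func() int, anything that accepts such a function can call it without knowing that a lexer exists.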

This lets us add context awareness without complicating the function signature. Without method values, each step would need to accept both the state and the context, and the driver would need to know about both. A method value binds its receiver at the point where it is created, leaving the context as the only parameter the driver supplies.

  
var (
    p     = new(pipeline)
    ctx   = context.Background()
    state = p.start
)
for state != nil {
    state = state(ctx)
}

This also provides the benefit of namespacing different sequences from one another. State and steps are rolled into one, making it clear which code is related to what sequence.
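Fleshed out into something runnable, the idea might look like the sketch below. The stepFn name, the pipeline steps, and the trace field are all assumptions made for illustration; the point is that run only ever supplies a context.

```go
package main

import (
	"context"
	"fmt"
)

// stepFn is a hypothetical name for a context-aware state function.
type stepFn func(context.Context) stepFn

// pipeline holds the sequence's state; trace records which steps ran.
type pipeline struct{ trace []string }

func (p *pipeline) start(ctx context.Context) stepFn {
	p.trace = append(p.trace, "start")
	return p.build // method value: p is bound, ctx is still open
}

func (p *pipeline) build(ctx context.Context) stepFn {
	if ctx.Err() != nil {
		p.trace = append(p.trace, "canceled")
		return nil
	}
	p.trace = append(p.trace, "build")
	return p.finish
}

func (p *pipeline) finish(ctx context.Context) stepFn {
	p.trace = append(p.trace, "finish")
	return nil
}

// run drives any sequence without knowing the type that holds its state.
func run(ctx context.Context, first stepFn) {
	for state := first; state != nil; {
		state = state(ctx)
	}
}

// traceFor runs a fresh pipeline, optionally under an already canceled
// context, and reports the steps it visited.
func traceFor(canceled bool) []string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if canceled {
		cancel()
	}
	p := new(pipeline)
	run(ctx, p.start)
	return p.trace
}

func main() {
	fmt.Println(traceFor(false)) // prints "[start build finish]"
	fmt.Println(traceFor(true))  // prints "[start canceled]"
}
```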

However, the driver’s ignorance of the state type comes at a cost. A hypothetical definition of a step.Func might look like:

  
type Func func(context.Context) Func

This is perfectly valid, but it makes steps fungible between sequences: nothing stops a step in one sequence from returning a step that belongs to another. Fortunately, type parameters give us a way to mark certain steps as belonging to the same sequence.

  
type Func[T any] func(context.Context) Func[T]

Despite no value of type T being passed or stored, its presence in the return type means a step is only compatible with other functions that return Func[T]. Once again, the state type is a natural fit for T here, and type inference means step.Do(ctx, p.start) compiles without explicit type arguments.

The type parameter is not a hard constraint — the compiler doesn’t enforce that methods on deploy must return Func[deploy]. But it provides a secondary namespace marker in the return type that makes it clear which sequence a step belongs to. When the convention is followed, it’s impossible for a step to return another step from the wrong sequence.
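To make the marker concrete, here is a sketch with two unrelated sequences. The deploy and cleanup types and the do driver are hypothetical stand-ins (do is not the library's step.Do), and Func is the error-free definition shown above.

```go
package main

import (
	"context"
	"fmt"
)

// Func mirrors the definition above: T is never stored, only named.
type Func[T any] func(context.Context) Func[T]

// deploy and cleanup are two hypothetical, unrelated sequences.
type deploy struct{ trace []string }
type cleanup struct{ trace []string }

func (d *deploy) start(context.Context) Func[deploy] {
	d.trace = append(d.trace, "deploy.start")
	return d.finish
}

func (d *deploy) finish(context.Context) Func[deploy] {
	d.trace = append(d.trace, "deploy.finish")
	return nil
}

func (c *cleanup) start(context.Context) Func[cleanup] {
	c.trace = append(c.trace, "cleanup.start")
	// Returning new(deploy).start here would not compile: its result
	// type is Func[deploy], which is not assignable to Func[cleanup].
	return nil
}

// do is a stand-in driver for this sketch.
func do[T any](ctx context.Context, f Func[T]) {
	for f != nil {
		f = f(ctx)
	}
}

// runBoth drives each sequence and joins their traces.
func runBoth() []string {
	ctx := context.Background()
	d, c := new(deploy), new(cleanup)
	do(ctx, d.start) // T is inferred as deploy
	do(ctx, c.start) // T is inferred as cleanup
	return append(d.trace, c.trace...)
}

func main() {
	fmt.Println(runBoth()) // prints "[deploy.start deploy.finish cleanup.start]"
}
```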

Errors

It’s possible to express error handling as another step in the sequence. However, most of the time, when executing a linear sequence of steps, it is desirable to stop at the first error encountered. To that end, in my library, lesiw.io/step, step.Func[T] includes an error return value in its signature.

  
type Func[T any] func(context.Context) (Func[T], error)

The sequence driver, step.Do, stops the sequence at the first non-nil error value returned, and wraps the error with the name of the function that emitted it. It also checks context cancellation before each step.

  
err := step.Do(ctx, p.start)
if err != nil {
    log.Fatal(err) // prints "function: error"
}

This does not stop users from creating their own error flow by adding a new error-handling step, say, return p.handleErr(err), nil. It just makes the common case — stopping on the first error — work as expected.
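A sketch of such a custom error flow follows. It defines its own Func and a simplified do driver so that it runs without the library; the pipeline type and the fetch, handleErr, and fallback steps are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Func mirrors the signature above.
type Func[T any] func(context.Context) (Func[T], error)

// do is a simplified stand-in for the driver: it stops at the first
// non-nil error.
func do[T any](ctx context.Context, f Func[T]) (err error) {
	for f != nil {
		if f, err = f(ctx); err != nil {
			return err
		}
	}
	return nil
}

type pipeline struct{ trace []string }

func (p *pipeline) fetch(context.Context) (Func[pipeline], error) {
	p.trace = append(p.trace, "fetch")
	err := errors.New("mirror unreachable")
	// Route the failure to a step instead of returning it to the driver.
	return p.handleErr(err), nil
}

// handleErr builds a step that closes over the error it should handle.
func (p *pipeline) handleErr(err error) Func[pipeline] {
	return func(context.Context) (Func[pipeline], error) {
		p.trace = append(p.trace, "handled: "+err.Error())
		return p.fallback, nil
	}
}

func (p *pipeline) fallback(context.Context) (Func[pipeline], error) {
	p.trace = append(p.trace, "fallback")
	return nil, nil
}

// run drives the sequence and reports what happened.
func run() ([]string, error) {
	p := new(pipeline)
	err := do(context.Background(), p.fetch)
	return p.trace, err
}

func main() {
	trace, err := run()
	fmt.Println(trace, err) // prints "[fetch handled: mirror unreachable fallback] <nil>"
}
```

Since the error never reaches the driver, the sequence keeps going and the recovery path is just another transition.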

When executing a sequence, it’s often useful to report the steps that have passed or failed. step.Do therefore takes a variable number of step.Handlers. The step.Handler interface has a single method, Handle, which takes information about the step that was executed and its returned error value.

This lets users write custom handlers to display the multi-step process as they see fit. One handler, step.Log, is provided for convenience. It writes to any io.Writer and presents successful steps with a check mark and unsuccessful steps with an x, followed by the error.

Users may also wish to represent outcomes beyond the binary of pass or fail. This can be accomplished by wrapping the error in step.Continue(err), which signals to the driver that it’s safe to continue despite the presence of an error value.

Handlers still receive the error — step.Log, for instance, renders continued steps with ⊘ — but Do proceeds to the next step instead of returning.

✔ detectOS
⊘ install: skip
✔ configure
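One way such a marker could work is sketched below. The continueError type and the shouldStop helper are assumptions for illustration, and the library's own implementation may differ. The idea is to wrap the error in a type that errors.As can find anywhere in the chain.

```go
package main

import (
	"errors"
	"fmt"
)

// continueError is a hypothetical wrapper that marks an error as non-fatal.
type continueError struct{ error }

// Unwrap keeps the original error reachable through errors.Is and errors.As.
func (e *continueError) Unwrap() error { return e.error }

// Continue marks err as safe to continue past.
func Continue(err error) error { return &continueError{err} }

// shouldStop is the decision a driver has to make: any error stops the
// sequence unless a continueError is somewhere in its chain.
func shouldStop(err error) bool {
	ce := new(continueError)
	return err != nil && !errors.As(err, &ce)
}

var errSkip = errors.New("skip")

func main() {
	fmt.Println(shouldStop(nil))               // prints "false"
	fmt.Println(shouldStop(errSkip))           // prints "true"
	fmt.Println(shouldStop(Continue(errSkip))) // prints "false"

	// Wrapping again, as a driver might do to add the step name, keeps
	// both the marker and the original error discoverable.
	wrapped := fmt.Errorf("install: %w", Continue(errSkip))
	fmt.Println(shouldStop(wrapped))          // prints "false"
	fmt.Println(errors.Is(wrapped, errSkip)) // prints "true"
	fmt.Println(wrapped)                      // prints "install: skip"
}
```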

Since the handler is called after each step, the handler itself can be a method on the state type. This is useful for buffered logging, where step output is captured and only shown on failure.

  
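type deploy struct {
    bytes.Buffer // captures step output until Handle runs
    os string
}

// Handle shows the captured output only when the step failed,
// then clears the buffer for the next step.
func (d *deploy) Handle(i step.Info, err error) {
    if err != nil {
        io.Copy(os.Stderr, d)
    }
    d.Reset()
}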
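To see the buffering at work end to end, here is a self-contained sketch. It swaps in local stand-ins for the library's types, a simplified run driver, and an out field in place of os.Stderr so the result can be inspected. The configure and start steps and their messages are invented.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"io"
	"reflect"
	"runtime"
	"strings"
)

// Func and Info are local stand-ins for the library's types.
type Func[T any] func(context.Context) (Func[T], error)

type Info struct{ Name string }

type deploy struct {
	bytes.Buffer           // step output accumulates here
	out          io.Writer // hypothetical: where failed output is flushed
}

// Handle flushes the captured output only when the step failed.
func (d *deploy) Handle(i Info, err error) {
	if err != nil {
		fmt.Fprintf(d.out, "%s: %v\n", i.Name, err)
		io.Copy(d.out, d)
	}
	d.Reset()
}

func (d *deploy) configure(context.Context) (Func[deploy], error) {
	fmt.Fprintln(d, "writing config") // discarded: this step succeeds
	return d.start, nil
}

func (d *deploy) start(context.Context) (Func[deploy], error) {
	fmt.Fprintln(d, "port 8080 already in use")
	return nil, errors.New("exit status 1")
}

// name resolves a step's short name through the runtime.
func name(f any) string {
	full := runtime.FuncForPC(reflect.ValueOf(f).Pointer()).Name()
	return strings.TrimSuffix(full[strings.LastIndex(full, ".")+1:], "-fm")
}

// run is a simplified driver that reports every step to d.Handle.
func run(ctx context.Context, d *deploy) (err error) {
	for f := Func[deploy](d.configure); f != nil; {
		i := Info{Name: name(f)}
		f, err = f(ctx)
		d.Handle(i, err)
		if err != nil {
			return err
		}
	}
	return nil
}

// demo runs the sequence and returns whatever reached the out writer.
func demo() (string, error) {
	var out strings.Builder
	err := run(context.Background(), &deploy{out: &out})
	return out.String(), err
}

func main() {
	out, err := demo()
	fmt.Print(out)
	fmt.Println("returned:", err)
}
```

Only the failing step's output survives: the line written by configure is dropped when Handle resets the buffer after a successful step.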

The complete driver follows.

  
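func Do[T any](ctx context.Context, f Func[T], h ...Handler) (err error) {
    for f != nil {
        // Check for cancellation before running each step.
        if err = ctx.Err(); err != nil {
            return err
        }
        // Capture the step's name before f is replaced by its successor.
        i := Info{Name: Name(f)}
        if f, err = f(ctx); err != nil {
            err = &Error{Info: i, error: err}
        }
        // Report every step to the handlers, whether it passed or failed.
        for _, handler := range h {
            handler.Handle(i, err)
        }
        // Stop on any error that was not wrapped with Continue.
        if ce := new(continueError); err != nil && !errors.As(err, &ce) {
            return err
        }
    }
    return nil
}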

Testing transitions

This works well as implementation code, but there are still challenges at test time. Go function values can only be compared to nil, so a test cannot write got == d.installLinux, for example. But the runtime knows the name of every function, and, in practice, that is enough to compare them.

The standard library already uses this technique in places. In net/ip_test.go, a helper resolves function names for test output.

  
func name(f any) string {
    return runtime.FuncForPC(reflect.ValueOf(f).Pointer()).Name()
}

lesiw.io/step does the same, and packages this into two functions.

  
func Name[T any](fn Func[T]) string {
    s := strings.Split(fullName(fn), ".")
    return strings.TrimSuffix(s[len(s)-1], "-fm")
}

func Equal[T any](a, b Func[T]) bool {
    return fullName(a) == fullName(b)
}

Name returns the short name, useful for logging and display. Equal compares fully qualified runtime names, so identically named functions in different packages cannot be conflated.
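The two functions rely on a fullName helper that is not shown here. The sketch below assumes it is the same runtime lookup as the net test helper, which may not match the library exactly, and exercises Name and Equal against a hypothetical deploy sequence.

```go
package main

import (
	"context"
	"fmt"
	"reflect"
	"runtime"
	"strings"
)

type Func[T any] func(context.Context) (Func[T], error)

// fullName is an assumed implementation: it asks the runtime for the
// fully qualified name of the function behind fn.
func fullName[T any](fn Func[T]) string {
	return runtime.FuncForPC(reflect.ValueOf(fn).Pointer()).Name()
}

func Name[T any](fn Func[T]) string {
	s := strings.Split(fullName(fn), ".")
	return strings.TrimSuffix(s[len(s)-1], "-fm")
}

func Equal[T any](a, b Func[T]) bool {
	return fullName(a) == fullName(b)
}

// deploy is a hypothetical sequence with one branching step.
type deploy struct{ os string }

func (d *deploy) install(context.Context) (Func[deploy], error) {
	if d.os == "linux" {
		return d.installLinux, nil
	}
	return d.installDarwin, nil
}

func (d *deploy) installLinux(context.Context) (Func[deploy], error) {
	return nil, nil
}

func (d *deploy) installDarwin(context.Context) (Func[deploy], error) {
	return nil, nil
}

// demo reports the name of the step that install chose, and how it
// compares against both candidates.
func demo() (string, bool, bool) {
	d := &deploy{os: "linux"}
	next, _ := d.install(context.Background())
	return Name(next), Equal(next, d.installLinux), Equal(next, d.installDarwin)
}

func main() {
	fmt.Println(demo()) // prints "installLinux true false"
}
```

Because the comparison is by name, two method values for the same method compare equal even when they are bound to different receivers. For checking transitions that is usually what a test wants, but it is worth keeping in mind.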

This makes it possible to write tests for transitions.

  
func TestInstallLinux(t *testing.T) {
    d := &deploy{os: "linux"}
    got, err := d.install(t.Context())
    if err != nil {
        t.Fatalf("install err: %v", err)
    }
    if want := d.installLinux; !step.Equal(got, want) {
        t.Errorf("got %s, want %s", step.Name(got), step.Name(want))
    }
}

Each step is tested independently. A bug in install does not cascade into installLinux tests. The error message names the exact transition that was wrong, following the got-before-want convention.

Publishing solutions

When a problem comes up more than once, it can be useful to solve it once and publish the result, whether as a blog post or as a library. lesiw.io/defers, for instance, handles program-wide defers that run on exit or OS signal. It’s a simple enough solution that a little copying would work fine, but publishing it as a library additionally provides the option to import it directly.

lesiw.io/step is another solution in the same spirit. If the approach described here is useful on its own, then I consider its goal met. But for those who, like me, find code easier to read than documentation, the source is worth a look.