Orchestrating the execution of individual services

So far, we have created three services for our monolith that all implement the Service interface. Now, we need to introduce a supervisor for coordinating their execution and making sure that they all cleanly terminate if any of them reports an error. Let's define a new type so that we can model a group of services and add a helper Run method to manage their execution life cycle:

type Group []Service

func (g Group) Run(ctx context.Context) error {...}

Now, let's break down the Run method's implementation into smaller chunks and go through each one:

if ctx == nil {
    ctx = context.Background()
}
runCtx, cancelFn := context.WithCancel(ctx)
defer cancelFn()

As you can see, first, we create a new cancelable context that wraps the one that was externally provided to us by the Run method caller. The wrapped context will be provided as an argument to the Run method of each individual service, thus ensuring that all the services can be canceled in one of two ways:

By the caller if, for instance, the provided context is canceled or expires
By the supervisor, if any of the services raise an error

Next, we will spin up a goroutine for each service in the group and execute its Run method, as follows:

var wg sync.WaitGroup
errCh := make(chan error, len(g))
wg.Add(len(g))
for _, s := range g {
    go func(s Service) {
        defer wg.Done()
        if err := s.Run(runCtx); err != nil {
            errCh <- xerrors.Errorf("%s: %w", s.Name(), err)
            cancelFn()
        }
    }(s)
}

If an error occurs, the goroutine will annotate it with the service name and write it to a buffered error channel before invoking the cancel function for the wrapped context. As a result, if any service fails, all the other services will be automatically instructed to shut down.

A sync.WaitGroup helps us keep track of the currently running goroutines. As we mentioned previously, we are working with long-running services whose Run method only returns if the context is canceled or an error occurs. In either case, the wrapped context will expire so that we can have our service runner wait for this event to occur and then call the wait group's Wait method to ensure that all the spawned goroutines have terminated before proceeding. The following code demonstrates how this is achieved:

    <-runCtx.Done()
    wg.Wait()

Before returning, we must check for the presence of errors. To this end, we close the error channel so that we can iterate it using a range statement. Closing the channel is safe since all the goroutines that could potentially write to it have already terminated. Consider the following code:

    var err error
    close(errCh)
    for srvErr := range errCh {
        err = multierror.Append(err, srvErr)
    }
    return err

As shown in the preceding snippet, after closing the channel, the code dequeues and aggregates any reported errors and returns them to the caller. Note that a nil error value will be returned if no error has occurred.