6

Error Handling

This chapter is about dealing with errors and panics in a concurrent program. First, we will look at how error handling can be incorporated into concurrent programs, including how to pass errors between goroutines so that you can handle or report them. Then we will talk about panics.

There are no hard and fast rules about dealing with errors and panics, but I hope some of the guidelines described in this chapter will help you write more robust code. The first guideline is this: never ignore errors. The second guideline tells you when to return an error and when to panic: the audience for errors is the users of the program; the audience for panics is the developers of the program.

This chapter contains the following sections:

  • Error handling
  • Panics

At the end of this chapter, you will have seen several approaches to error handling in concurrent programs.

Error handling

Error handling in Go has been a polarizing issue. Frustrated with the repetitive error-handling boilerplate, many Go users in the community (including me) suggested improved error-handling mechanisms. Most of these proposals were actually error-passing improvements because, to be honest, errors are rarely handled. Rather, they are passed to the caller, sometimes wrapped with some contextual information.

A good number of error-handling proposals suggested different variations on throw-catch, while many others were simply what is called syntactic sugar for if err!=nil return err. Many of these proposals missed the point that the existing error reporting and handling conventions work nicely in a concurrent environment, such as the ability to pass errors via a channel: you can handle errors generated by one goroutine in another goroutine.

An important point I like to emphasize when working with Go programs is that most of the time, they can be analyzed by solely relying on the information you see on screen. All possible code paths are explicit in the code. In that sense, Go is an extremely reader-friendly language. The Go error handling paradigm is partly responsible for this. Many functions return some values combined with an error. So, how the program deals with errors is usually explicit in the code.

Errors are generated by the program when an unacceptable situation is detected, such as a failing network connection, or an invalid user input. These errors are usually passed to the caller, sometimes wrapped in another error to add information describing the context. The added information is important because many of these errors are converted into messages for the users of the program. For example, a message complaining about a JSON parsing error is useless in a program that works with many JSON files unless it also tells which file has the error.

But goroutines do not return errors. We have to find other ways to deal with errors when goroutines fail. When multiple goroutines are used for different parts of a computation, and one of the goroutines fails, the remaining goroutines should also be canceled, or their results thrown away. Sometimes, multiple goroutines fail, and you have to deal with multiple error values. There are a few guidelines and many third-party packages that will make error handling easier for you. The first guideline is: never ignore errors.

Let’s look at some common patterns. If you submit work to a goroutine and expect to receive a result later, make sure that result includes error information in it. The pattern illustrated here is useful if you have multiple tasks that can be performed concurrently. You start each task in its own goroutine and then collect the results or errors as needed. This is also a good way of dealing with errors when you have a worker pool:

// Result type keeps the expected result, and the
// error information.
type Result1 struct {
     Result ResultType1
     Error  error
}
type Result2 struct {
     Result ResultType2
     Error  error
}
...
result1Ch:=make(chan Result1)    
go func() {
     result, err := handleRequest1()
     result1Ch <- Result1{ Result: result, Error: err }
}()
result2Ch:=make(chan Result2)    
go func() {
     result, err := handleRequest2()
     result2Ch <- Result2{ Result: result, Error: err }
}() 
// Do other work
...
// Collect the results from the goroutines
result1:=<-result1Ch
result2:=<-result2Ch 
if result1.Error!=nil {
// handle error
       ...
}
if result2.Error!=nil {
// handle error
       ...
}

You have to be mindful of the active goroutines when you detect an error. For example, the preceding program reads from all the result channels before checking for errors. This ensures that all the goroutines that were started terminate, and then error handling is performed. The following implementation would leak the second goroutine:

result1:=<-result1Ch
if result1.Error!=nil {
// result2Ch is never read. Goroutine leaks!
       return result1.Error
}
result2:=<-result2Ch
...

In many cases, it may be just fine to let all goroutines complete and then return the error, or return a composite error if multiple goroutines failed. But sometimes, you may want to cancel running goroutines if another goroutine fails. In Chapter 8, we will talk about using context.Context to cancel such computations. For now, we can use a canceled channel to notify the goroutines that they should stop processing. If you remember, this is a common pattern where the closing of a channel is used to broadcast a signal to all the goroutines. So, when a goroutine detects an error, it will close the canceled channel. All goroutines will periodically check whether the canceled channel is closed and return an error if so. But there is a problem with this approach: if more than one goroutine fails, they will all try to close the channel, and closing an already closed channel will panic. So instead of closing the canceled channel directly, each failing goroutine sends to a cancel channel, and a separate goroutine that listens to the cancel channel closes the canceled channel only once:

// Separate result channels for goroutines
resultCh1 := make(chan Result1)
resultCh2 := make(chan Result2)
// canceled channel is closed once when a goroutine
// sends to cancelCh
canceled := make(chan struct{})
// cancelCh can receive many cancellation requests,
// but closes canceled channel once
cancelCh := make(chan struct{})
// Make sure cancelCh is closed, otherwise the
// goroutine that reads from it leaks
defer close(cancelCh)
go func() {
     // close canceled channel once when received 
     // from cancelCh
     once := sync.Once{}
     for range cancelCh {
           once.Do(func() {
                close(canceled)
           })
     }
}()
// Goroutine 1 computes Result1
go func() {
     result, err := computeResult1()
     if err != nil {
          // cancel other goroutines
           cancelCh <- struct{}{}
           // Send error back. Do not close channel
           resultCh1 <- Result1{Error: err}
           return
     }
     // If other goroutines failed, stop computation
     select {
          case <-canceled:
               // close resultCh1, so the listener does 
               // not block
               close(resultCh1)
               return
     default:
     }
     // Do more computations
}()
// Goroutine 2 computes Result2
go func() {
   ...
}()
// Receive results. The channel will be closed if
// the goroutine was canceled (ok will be false)
result1, ok1 := <-resultCh1
result2, ok2 := <-resultCh2

Here, if goroutine 1 fails, resultCh1 will return the error, goroutine 2 will be canceled, and resultCh2 will be closed. If goroutine 2 fails, resultCh2 will return the error, goroutine 1 will be canceled, and resultCh1 will be closed. If they both fail concurrently, both errors will be returned.

A variation of this is using an error channel instead of a cancel channel. A separate goroutine listens to the error channel and captures the errors from the goroutines:

// errCh will communicate errors
errCh := make(chan error)
// Any error will close canceled channel
canceled := make(chan struct{})
// Ensure error listener terminates
defer close(errCh)
// collect all errors.
errs := make([]error, 0)
go func() {
     once := sync.Once{}
     for err := range errCh {
           errs = append(errs, err)
           // cancel all goroutines when error received
           once.Do(func() { close(canceled) })
     }
}()
resultCh1 := make(chan Result1)
go func() {
    defer close(resultCh1)
     result, err := computeResult()
     if err != nil {
           errCh <- err
           // Make sure listener does not block
           return
     }
     // If canceled, stop
     select {
     case <-canceled:
           return
     default:
     }
     resultCh1 <- result
}()
result, ok := <-resultCh1

Yet another error handling approach that I often see in the field is using dedicated error variables in the enclosing scope for each goroutine. This approach needs a WaitGroup, and there is no way to cancel work when one of the goroutines fails. Nevertheless, it can be useful if none of the goroutines perform cancelable operations. If you end up implementing this pattern, make sure errors are read after the wait group’s Wait() call because, according to the Go Memory Model, the setting of the error variables happens before the return of that Wait() call, but they are concurrent until then:

wg := sync.WaitGroup{}
wg.Add(2)
var err1 error
go func() {
     defer wg.Done()
     if err := doSomething1(); err != nil {
           err1 = err
           return
     }
}()
var err2 error
go func() {
     defer wg.Done()
     if err := doSomething2(); err != nil {
           err2 = err
           return
     }
}()
wg.Wait()
// Collect results and deal with errors here
if err1 != nil {
     // handle err1
}
if err2 != nil {
     // handle err2
}

Pipelines

There are several options for error handling when working with an asynchronous pipeline. A pipeline is usually constructed to process many inputs. Because of this, it is usually not desirable to stop the pipeline just because processing failed for one of the inputs. Instead, you log or record the error and continue processing. The important thing is to capture enough context with the error so that after everything is said and done, you can go back and figure out what went wrong for what input. The options to deal with errors in a pipeline include, but are not limited to, the following:

  • Each stage handles errors itself by using an error recorder function. The error recorder must be able to deal with concurrent calls if multiple stages attempt to record errors concurrently.
  • Use a separate error channel with an error listener goroutine. When an error is detected in the pipeline, the relevant context is captured (the input filename, identifier, or the complete input, what went wrong, which stage failed, etc.) and sent to a channel. The error listener goroutine stores the error information in a database or logs it.
  • Pass the error to the next stage. Each stage checks whether the input contains errors and passes it along until the end of the pipeline, where an error output is produced.
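
The second option could be sketched as follows. This is only an illustration under some assumptions: the stageError type, the parseStage and runPipeline functions, and the single-stage pipeline that converts strings to integers are all hypothetical, and the error listener simply collects errors in a slice where a real program might log them or store them in a database:

```go
package main

import (
	"fmt"
	"strconv"
)

// stageError is a hypothetical error type that captures the context
// of a failure: which stage failed, and for which input.
type stageError struct {
	Stage string
	Input string
	Err   error
}

func (e stageError) Error() string {
	return fmt.Sprintf("stage %s, input %q: %v", e.Stage, e.Input, e.Err)
}

// parseStage converts each input to an integer. A failed input is
// reported to errCh, and processing continues with the next input.
func parseStage(input <-chan string, output chan<- int, errCh chan<- error) {
	defer close(output)
	for s := range input {
		n, err := strconv.Atoi(s)
		if err != nil {
			errCh <- stageError{Stage: "parse", Input: s, Err: err}
			continue
		}
		output <- n
	}
}

// runPipeline runs the inputs through parseStage, and returns the
// results together with all recorded errors.
func runPipeline(inputs []string) ([]int, []error) {
	input := make(chan string)
	output := make(chan int)
	errCh := make(chan error)
	// Error listener goroutine records errors until errCh is closed
	var errs []error
	listenerDone := make(chan struct{})
	go func() {
		defer close(listenerDone)
		for err := range errCh {
			errs = append(errs, err)
		}
	}()
	go func() {
		defer close(input)
		for _, s := range inputs {
			input <- s
		}
	}()
	go parseStage(input, output, errCh)
	var results []int
	for n := range output {
		results = append(results, n)
	}
	// The output channel is closed, so the stage has returned and
	// will not send any more errors. Stop the listener and wait for it
	close(errCh)
	<-listenerDone
	return results, errs
}

func main() {
	results, errs := runPipeline([]string{"1", "two", "3"})
	fmt.Println(results)
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

Note that errCh is closed only after the stage has terminated, and the recorded errors are read only after the listener goroutine has finished.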

Servers

When I talk about servers, I mainly talk about their request-oriented nature and not about their communication characteristics. The requests may come from a network via HTTP or gRPC or they can come from the command line. Usually, each request is handled in a separate goroutine. Thus, it is up to the request-handling stack to propagate meaningful errors that can be used to build a response to the user. If that user is another program (that is, if we’re talking about a web service, for instance), it makes sense to include an error code and some diagnostic message. Structured errors are your best friend:

// Embed this error in all other structured errors that can be returned from the API
type Error struct {
    Code int
    HTTPStatus int
    DiagMsg string
}
// HTTPError extracts HTTP information from an error
type HTTPError interface {
   GetHTTPStatus() int
   GetHTTPMessage() string
}
func (e Error) GetHTTPStatus() int {
  return e.HTTPStatus
}
func (e Error) GetHTTPMessage() string {
   return fmt.Sprintf("%d: %s",e.Code,e.DiagMsg)
}
// Treat HTTPErrors and other unrecognized errors
// separately
func WriteError(w http.ResponseWriter, err error) {
   if e, ok := err.(HTTPError); ok {
      // http.Error takes the message first, then the status code
      http.Error(w, e.GetHTTPMessage(), e.GetHTTPStatus())
   } else {
      http.Error(w, err.Error(), http.StatusInternalServerError)
   }
}

Error implementations such as the preceding one will help you return meaningful errors to your users, so they can deal with common problems without wasting hours in frustration.

Panics

Panics are different from errors. A panic is either a programming error or a condition that cannot be reasonably remedied (such as running out of memory). Because of this, a panic should be used to convey as much diagnostic information to the developer as possible.

Some errors can become panics depending on the context. For instance, a program may accept a template from the user and generate an error if the template parsing fails. However, if the parsing of a hardcoded template fails, then the program should panic. The first case is a user error, and the second case is a bug.
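
The template example can be sketched with the standard library text/template package, whose template.Must function panics if parsing fails. The greeting template and the parseUserTemplate function are hypothetical names chosen for this illustration:

```go
package main

import (
	"fmt"
	"os"
	"text/template"
)

// A hardcoded template. If parsing fails, that is a bug in the
// program, so template.Must panics during initialization.
var greeting = template.Must(
	template.New("greeting").Parse("Hello, {{.Name}}!\n"))

// parseUserTemplate parses a template supplied by the user. If
// parsing fails, that is a user error, so it is returned as an error.
func parseUserTemplate(text string) (*template.Template, error) {
	tmpl, err := template.New("user").Parse(text)
	if err != nil {
		return nil, fmt.Errorf("invalid template: %w", err)
	}
	return tmpl, nil
}

func main() {
	data := map[string]string{"Name": "gopher"}
	if err := greeting.Execute(os.Stdout, data); err != nil {
		fmt.Println(err)
	}
	// The action is not closed, so parsing fails with an error
	if _, err := parseUserTemplate("Hello, {{.Name"); err != nil {
		fmt.Println("user error:", err)
	}
}
```

The same parsing failure is reported to the user in one case and crashes the program in the other, because only the second one can be fixed by the developer.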

As a developer of concurrent programs, you can do only three things with an error: you either handle it (log it, choose another program flow, or ignore it by doing nothing), you pass it to the caller (sometimes with some additional contextual information), or you panic. When a panic happens in a concurrent program, the runtime ensures that all the nested function calls return, one by one, all the way to the function that started that goroutine. While this is happening, all deferred blocks of the functions also run. This is a chance to recover from a panic, or to clean up any resources that will not be garbage-collected. If the panic is not handled by one of the functions in the call chain, the program will crash. So, as the developer, you have some cleaning up to do.

In a server program, usually, a separate goroutine handles each request. Most server frameworks (including the standard library net/http package) handle such panics without crashing by printing a stack and failing the request. If you are writing a server without using such a library or if you want to report more information when you catch a panic, you should handle them yourself:

func PanicHandler(next func(http.ResponseWriter,*http.Request)) func(http.ResponseWriter,*http.Request) {
  return func(wr http.ResponseWriter, req *http.Request) {
    defer func() {
        if err:=recover(); err!=nil {
           // print panic info
        }
    }()
    next(wr,req)
   }
}
func main() {
     // PanicHandler returns a function, so register it
     // with HandleFunc
     http.HandleFunc("/path", PanicHandler(pathHandler))
}

You can only recover a panic in the goroutine in which it was initiated. That means if you start a goroutine that can initiate a panic and you do not want that panic to terminate the program, you have to recover it in that goroutine:

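go func(errCh chan<- error, resultCh chan<- result) {
     defer func() {
          if r := recover(); r != nil {
               // panic recovered, return error instead.
               // recover returns a value of type any, so
               // convert it to an error first
               errCh <- fmt.Errorf("panic: %v", r)
               close(resultCh)
          }
     }()
     // Do the work
}(errCh, resultCh)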

When working with concurrent processing pipelines (such as the ones we worked on in Chapter 5), it makes sense to deal with panics defensively. A panic usually points to a bug in the program but terminating a long-running pipeline after hours of processing is not the best solution. You usually want to have a log of all the panics and errors once the processing is complete. So, you have to make sure that the panic recovery is performed at the correct place. For instance, in the following code snippet, the panic recovery is around the actual pipeline stage processing function, so a panic is recorded, but the for loop continues processing:

func pipelineStage[IN any, OUT WithError](input <-chan IN, output chan<- OUT, errCh chan<-error, process func(IN) OUT) {
     defer close(output)
     for data := range input {
           // Process the next input
           result, err := func() (res OUT,err error) {
                defer func() {
                      // Convert panics to errors. recover
                      // returns a value of type any
                      if r := recover(); r != nil {
                           err = fmt.Errorf("panic: %v", r)
                      }
                }()
                return process(data),nil
           }()
           if err!=nil {
                // Report error and continue
                errCh<-err
                continue
           }
           output<-result
     }
}

If you are familiar with exception-handling mechanisms in C++ or Java, you might wonder whether panics can be used in place of throwing an exception. Some guidelines strongly discourage that, but you can find other resources advocating the exact same thing. I will leave that judgment to you, but there are examples of it in the standard library JSON package as well, and one might argue that if you have a large package with very few exported functions, it may make sense to use panics as an error-handling mechanism because it becomes an implementation detail. JSON unmarshaling is one example, and a deeply nested parser is another case where this might help. If you decide to do it, here’s the way: use package-level error types to distinguish between real panics and errors. The following snippet is a modified version of the standard library JSON unmarshaling implementation:

// All internal functions panic with this type of
// error instead of returning an error
type packageError struct{ error }
// Exported function is the top-level function that
// calls the unexported implementation functions and 
// recovers panics
func ExportedFunction() (err error) {
     defer func() {
           if r := recover(); r != nil {
           // If the panic is an error thrown from the
           // package recover and return error
                if e, ok := r.(packageError); ok {
                      err = e.error
                } else {
                      // This is a real panic
                      panic(r)
                }
           }
     }()
     unexportedFunction()
     return nil
}
// unexportedFunction is the top-level of the
// implementation
func unexportedFunction() {
   if err:=doThings(); err!=nil {
      panic(packageError{err})
   }
   ...
}

Here, unexportedFunction performs the actual work, and ExportedFunction acts as the external interface of unexportedFunction by translating some of the panics into errors.

Summary

Your programs must generate useful error messages that tell the users what went wrong and how they can fix it. Go gives the developer full control over how errors are generated and how they are passed around. In this chapter, we saw some methods to deal with errors generated concurrently.

Next, we will look at timers and tickers for scheduling events in the future.
