Chapter 5: Handling Errors

There are multiple ways to handle errors in your code. Most commonly in Ruby, errors are handled by raising exceptions, but there are other approaches used occasionally, such as returning nil for errors.

In this chapter, you'll learn about trade-offs in error handling, issues when handling transient errors with retries, and more advanced error handling such as exponential backoff and circuit breakers. You'll also learn how to design useful exception class hierarchies.

In this chapter, we will cover the following topics:

  • Handling errors with return values
  • Handling errors with exceptions
  • Retrying transient errors
  • Designing exception class hierarchies

By the end of this chapter, you'll have a better understanding of how best to handle errors in your Ruby programs.

Technical requirements

In this chapter and all chapters of this book, code given in code blocks is designed to execute on Ruby 3.0. Many of the code examples will work on earlier versions of Ruby, but not all. The code for this chapter is available online at https://github.com/PacktPublishing/Polished-Ruby-Programming/tree/main/Chapter05.

Handling errors with return values

In programming languages that do not support exceptions, errors are generally handled by using a return value that indicates failure. Ruby itself is written in C, and in C, functions that can fail will often use a return value that is zero on success, and non-zero on failure. While Ruby has exceptions, there are cases where a method that can fail returns a value to signal the failure instead of raising an exception, even where other programming languages raise an exception.

For example, in Python, if you have a hash (called a dictionary in Python), and you try to access a member in the hash that doesn't exist, you get an exception raised:

# Python code:

{'a': 2}['b']

# KeyError: 'b'

Ruby takes a different approach in this case, returning nil:

{'a'=>2}['b']

# => nil

This shows the two different philosophies between the languages. In Ruby, it is expected that when you are looking for a value in a hash, it may not be there. In Python, it is expected that if you are looking for a value in a hash, it should exist. If you want to get the Ruby behavior in Python, you can use get:

# Python code:

{'a': 2}.get('b', None)

# => None (Python equivalent of Ruby's nil)

Likewise, if you want to get the Python behavior in Ruby, you can use fetch:

{'a'=>2}.fetch('b')

# KeyError (key not found: "b")

Both Python and Ruby support both behaviors for retrieving data from hashes, but by default, Ruby is permissive in this case, while Python is strict.

In other cases, such as which objects are treated as false in conditionals, Python is permissive, and Ruby is strict. Whether permissiveness in either area is a bug or a feature depends on your point of view. Most programmers who prefer to use Ruby probably consider Ruby's choices a feature, since otherwise, they would probably prefer to use another language.
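To make the strict side concrete: in Ruby, only nil and false are treated as false in conditionals, so values such as 0, an empty string, or an empty array (all treated as false in Python) are treated as true. The following is a minimal sketch; the truthy? helper is a name introduced here purely for illustration:

```ruby
# Hypothetical helper: reports how a value behaves in a conditional.
def truthy?(value)
  value ? true : false
end

# All of these are treated as false in Python, but as true in Ruby.
[0, 0.0, "", [], {}].map { |v| truthy?(v) }
# => [true, true, true, true, true]

# Only nil and false are treated as false in Ruby.
[nil, false].map { |v| truthy?(v) }
# => [false, false]
```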

Ruby's permissiveness in the hash retrieval case is what allows for Ruby's very simple idiomatic memoization technique:

hash[key] ||= value

This is because this memoization construct is shorthand for the following code:

hash[key] || (hash[key] = value)

If hash[key] raised an exception in Ruby when key wasn't present in hash, this shorthand wouldn't work, and you would have to write longer code that is more similar to the type of code needed in Python:

if hash.key?(key)

  hash[key]

else

  hash[key] = value

end
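One consequence of that expansion is worth seeing in action. Because the shorthand tests whether hash[key] is true, and not whether the key is present, a memoized nil or false value is recomputed on every call, while the longer key? form caches it. The following sketch assumes a hypothetical expensive_lookup lambda that legitimately returns nil:

```ruby
calls = Hash.new(0)

# Hypothetical slow computation that legitimately returns nil.
expensive_lookup = lambda do |key|
  calls[key] += 1
  nil
end

cache = {}

# The shorthand stores nil, sees nil again next time, and recomputes.
2.times { cache[:a] ||= expensive_lookup.call(:a) }
calls[:a] # => 2

# Checking for the key caches nil as well, so the lookup runs once.
2.times do
  if cache.key?(:b)
    cache[:b]
  else
    cache[:b] = expensive_lookup.call(:b)
  end
end
calls[:b] # => 1
```

For values that are never nil or false, the two forms behave the same, and the shorthand is usually the better choice.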

In general, the Ruby principle for data access via [] is that an exception is not raised if there is a way the access would work if the receiver included different data. You see this with arrays in the following code:

ary = [1, 2, 3]

ary[3]

# => nil

ary << 4

ary[3]

# => 4

Accessing the ary array with an index that is beyond the bounds of the array returns nil, because if the array is expanded later, the same call will be within the bounds of the array, and will return the value at that index.

You see this with hashes, shown in the following code:

hash = {1 => 2}

hash[3]

# => nil

hash[3] = 4

hash[3]

# => 4

Accessing hash with a key that does not currently exist in the hash returns nil, because if the key is added to the hash later, the same call will return the value associated with the key.

If you use the OpenStruct class in the standard library, you see that it operates the same way:

require 'ostruct'

os = OpenStruct.new

os[:b]

# => nil

os.b = 1

os[:b]

# => 1

As noted previously, the principle only applies if the call could return a result were the receiver to contain different data. If the call would always fail regardless of which data the receiver contains, Ruby raises an exception. You can see this with a Struct subclass:

A = Struct.new(:a)

a = A.new(1)

a[:a]

# => 1

a[:b]

# NameError (no member 'b' in struct)

This is because no matter what kind of data the A instance contains, it will not have a b element, so this call will always fail.

There are two primary benefits of using return values to signal errors:

  • First, this approach offers much better performance than using exceptions when an error occurs, with pretty much the same performance in the successful case. Unintuitively, the unsuccessful case can sometimes be much faster than the successful case.
  • Second, if the error is common, it's easier for the user to deal with it instead of forcing them to rescue an exception.
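If you want to check the performance claim on your own machine, a rough benchmark along the following lines can help. This is only a sketch: find_or_nil and find_or_raise are hypothetical method names, and the timings will vary by Ruby version and hardware, so no expected numbers are shown:

```ruby
require 'benchmark'

# Hypothetical lookup that signals a missing key with a return value.
def find_or_nil(hash, key)
  hash[key]
end

# Hypothetical lookup that signals a missing key with an exception.
def find_or_raise(hash, key)
  hash.fetch(key)
end

data = {a: 1}
n = 100_000

# Time the failure case when it is reported as nil.
return_time = Benchmark.realtime do
  n.times { find_or_nil(data, :missing) }
end

# Time the failure case when it is raised and rescued.
exception_time = Benchmark.realtime do
  n.times do
    begin
      find_or_raise(data, :missing)
    rescue KeyError
      nil
    end
  end
end

puts "return value: #{return_time.round(4)}s"
puts "exception:    #{exception_time.round(4)}s"
```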

Let's say you have a method that looks up a row in a database by the primary key of the row. In this case, the primary key is an integer column named id:

def pk_lookup(pk)

  database.first(<<-END)

    SELECT * FROM table where id = #{database.literal(pk)}

  END

end

Assuming database.first returns a hash or some other object when the row exists, and nil when the row does not exist, this is an example of a method that uses a return value to handle an error.

One issue with this method is that it will still run a query even if you know that the query will not return a row, such as when the value passed in is nil. Assuming that this is a case you want to optimize for, you can use this code:

def pk_lookup(pk)

  return unless pk

  database.first(<<-END)

    SELECT * FROM table where id = #{database.literal(pk)}

  END

end

The preceding code gives you the same behavior. However, it improves the performance of the case where the pk argument is nil, making it much faster than the success case since the database query is skipped.

The trade-off in this case is that every time you call pk_lookup, you cannot assume it will return a valid row. Code such as row = pk_lookup(1) will not raise an exception when pk_lookup is called if there is no matching row.

However, if row is used later and expected to be a hash or other object, the code will fail later, which may complicate debugging. In general, that's not a major issue, because if there is a problem due to not finding a row, you'll probably be alerted to it one way or another.
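The delayed failure can be sketched without a real database. In the following code, pk_lookup is a hypothetical in-memory stand-in for the method above, with the rows passed as a default argument purely for illustration:

```ruby
# Hypothetical in-memory stand-in for the database-backed pk_lookup.
def pk_lookup(pk, rows = {1 => {id: 1, name: 'Alice'}})
  return unless pk
  rows[pk]
end

row = pk_lookup(2) # no exception here; row is simply nil

error = begin
  row[:name]       # the failure only surfaces at the first use
  nil
rescue NoMethodError => e
  e
end

error.class # => NoMethodError
```

The exception points at the line that used row, not at the lookup that failed to find it, which is why this style can take longer to debug.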

A more insidious case is when, in normal use of the method, you do not need the return value because the method is called for side effects. Consider the case where instead of looking up an object by primary key, you are updating the database. The following code demonstrates this:

def update(pk, column, value)

  database.run_update(<<-SQL)

    UPDATE table

    SET #{column} = #{database.literal(value)}

    WHERE id = #{database.literal(pk)}

  SQL

end

You can assume that database.run_update, in this case, returns the number of rows updated. In the general case, the return value of database.run_update is useful because an update can affect more than one row. However, because you are passing the primary key in this case, you are sure that it will never modify more than one row, and the return value may not be important. You may often call this method and ignore the return value by using this code:

update(self.id, :name, 'New Name')

The problem, in this case, is that if the database row with the current id doesn't exist, this method returns 0. However, since you aren't checking the return value, you don't know whether this code is making the expected changes.

This type of error can linger in code undetected for a long time, especially in code that is not commonly called. You may only find out months or years later that you have missed updates, and at that point, there may be nothing you can do to fix the previous cases affected by the error.

This is not a theoretical case; it can be a common problem when using a database library where a method such as save returns false for an unsuccessful save instead of raising an exception.

The principle here is to be especially wary of using return values to indicate errors when the caller of the code does not need to use the return value of the method. It is usually better to raise an exception in this case, which you'll learn more about in the next section.
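As a rough illustration of that principle, the following sketch wraps the row count in a method that raises when nothing was updated. Everything here is hypothetical and introduced only for the example: the NoRowsUpdated class, the FakeDatabase stand-in, and the update! name:

```ruby
# Hypothetical error class, for illustration only.
class NoRowsUpdated < StandardError; end

# Hypothetical stand-in for a database that knows which ids exist.
class FakeDatabase
  def initialize(ids)
    @ids = ids
  end

  # Returns the number of rows updated, like database.run_update.
  def run_update(pk)
    @ids.include?(pk) ? 1 : 0
  end
end

# Raises instead of returning 0, so a missed update cannot be ignored.
def update!(database, pk)
  count = database.run_update(pk)
  raise NoRowsUpdated, "no row with id #{pk.inspect}" if count == 0
  count
end

database = FakeDatabase.new([1, 2])
updated = update!(database, 1) # => 1

message = begin
  update!(database, 3)
rescue NoRowsUpdated => e
  e.message
end
message # => "no row with id 3"
```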

In this section, you learned how to handle errors using return values, and the trade-offs in doing so. In the next section, you'll learn about the alternative approach, handling errors using exceptions.

Handling errors with exceptions

Raising exceptions is the most common way to handle errors in Ruby. All core methods in Ruby can raise an exception when called incorrectly. The easiest way to get a core method to trigger an exception is to pass it an incorrect number of arguments, as shown in the following code:

"S".length(1)

# ArgumentError (wrong number of arguments)

We can also get a core method to trigger an exception when passing the wrong type of argument:

'S'.count(1)

# TypeError (no implicit conversion of Integer into String)

In almost all cases, any unexpected or uncommon error should be raised as an exception, and not handled via a return value. Otherwise, as shown in the previous section, you end up with a case where the error is silently ignored. In the previous section, you saw an example where the update method using a return value to signal an error resulted in data loss. However, there are other cases where the results are even worse than data loss.

Consider a case where you are designing an authorization system. You have a class named Authorizer, and this has a singleton method named check that takes user and action, and should indicate whether user is authorized to perform an action. Here is a simple example of implementing such a class:

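class Authorizer

  attr_reader :user, :action

  # Store the user and action so authorized? can use them

  def initialize(user, action)

    @user = user

    @action = action

  end

  def self.check(user, action)

    new(user, action).authorized?

  end

  def authorized?

    return true if user.admin?

    return true if action == :view_own_profile

    false

  end

end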

One way to use the Authorizer class would be as follows:

if Authorizer.check(current_user, :manage_users)

  show_manage_users_page

else

  show_invalid_access_page

end

Unfortunately, this has issues similar to those seen in the previous section, where the method can be misused. If a new programmer doesn't understand the API, they may assume from a method name such as check that it handles the error by raising an exception, and write code such as the following:

Authorizer.check(current_user, :manage_users)

show_manage_users_page

This can be even worse than the data loss case described previously, and result in an elevation of privilege vulnerability in the application, or possibly even worse depending on which action is improperly allowed.

In this case, it's generally better for the Authorizer.check method to raise an exception:

class Authorizer

  class InvalidAuthorization < StandardError

  end

  def self.check(user, action)

    unless new(user, action).authorized?

      raise InvalidAuthorization,

        "#{user.name} is not authorized to perform #{action}"

    end

  end

end

By raising an exception, as the previous example does, you are forcing the user to handle the exception, avoiding the case where the failure is accidentally ignored. If Authorizer.check is implemented as in the previous example, and a new programmer doesn't understand the API, they may assume that it returns true to indicate that the action is authorized, and false to indicate that it is not. If they make that incorrect assumption, they would still have an issue. The following code demonstrates this:

if Authorizer.check(current_user, :manage_users)

  show_manage_users_page

else

  show_invalid_access_page

end

In the case where the action is authorized, the previous code works fine. However, in the case where the action is not authorized, an exception will be raised, instead of the invalid access page being shown. This is certainly a problem, but it's an easily fixable one.

There are two important principles here.

One of the principles is that when you are designing an API, you should not only design the API to be easy to use, but you should also attempt to design the API to be difficult to misuse. This is the principle of misuse resistance. A method that does not raise an exception for errors is easier to misuse than a method that raises an exception for errors.

Another of the principles at play is that of fail-open versus fail-closed design. In a fail-open design, if there is a problem with checking access, access is allowed. In a fail-closed design, if there is a problem with checking access, access is not allowed.

In most cases involving security, fail-closed is considered to be the superior model. In the example where Authorizer.check returns true or false, misuse of the method results in the system failing open, and unauthorized access being allowed.

In the example where Authorizer.check raises an Authorizer::InvalidAuthorization exception, misuse of the method results in the system failing closed, and unauthorized access not being allowed.
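The difference between the two failure modes can be demonstrated in a self-contained sketch. The User struct, the check_returning name (standing in for the first, boolean-returning definition of check), and the pages_shown array are all assumptions made for this illustration:

```ruby
# Hypothetical user object with the admin? method the authorizer needs.
User = Struct.new(:name, :admin) do
  def admin?
    admin
  end
end

class Authorizer
  class InvalidAuthorization < StandardError; end

  attr_reader :user, :action

  def initialize(user, action)
    @user = user
    @action = action
  end

  def authorized?
    return true if user.admin?
    return true if action == :view_own_profile
    false
  end

  # Returns true or false; ignoring the result fails open.
  def self.check_returning(user, action)
    new(user, action).authorized?
  end

  # Raises on failure; ignoring the result fails closed.
  def self.check(user, action)
    unless new(user, action).authorized?
      raise InvalidAuthorization,
        "#{user.name} is not authorized to perform #{action}"
    end
  end
end

guest = User.new('guest', false)
pages_shown = []

# Misuse of the boolean version: the return value is ignored,
# so execution continues to the protected page.
Authorizer.check_returning(guest, :manage_users)
pages_shown << :via_returning

# The same misuse of the raising version never reaches the page.
# The rescue stands in for a handler higher up in the application.
begin
  Authorizer.check(guest, :manage_users)
  pages_shown << :via_check
rescue Authorizer::InvalidAuthorization
  pages_shown << :invalid_access
end

pages_shown # => [:via_returning, :invalid_access]
```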

Now, there may be many cases where the user of Authorizer does need a true or false value for whether an action is authorized. For example, let's say you are showing a dashboard page and need to know whether to include a link to the page to manage users. You don't want to write the following code:

begin

  Authorizer.check(current_user, :manage_users)

rescue Authorizer::InvalidAuthorization

  # don't show link

else

  display_manage_users_link

end

The preceding code uses exceptions for flow control, which is, in general, a bad approach. In a case like this, it's usually better to have multiple methods. The Authorizer.check method should raise an exception, but if you want a true or false value, you can have a method such as the Authorizer.allowed? method, as shown in the following code:

class Authorizer

  def self.allowed?(user, action)

    new(user, action).authorized?

  end

end

Isn't this just the same as the first definition of the check method? Yes, it is. However, because the method name ends in ?, it signals to the user that this method will return a true or false value, and a user is much less likely to misuse it. With a method name such as check, it is ambiguous as to whether the method will return true or false or raise an exception, so misuse is much more likely to happen.

One other advantage of using exceptions to handle errors is that in many cases, higher-level code wants to handle the same type of error the same way. So, instead of having one hundred different if/else expressions in your application that use Authorizer.allowed?, as shown in the following code:

if Authorizer.allowed?(current_user, :manage_users)

  show_manage_users_page

else

  show_invalid_access_page

end

You can use a much simpler approach with Authorizer.check, as shown in the following code snippet:

Authorizer.check(current_user, :manage_users)

show_manage_users_page

Then, in a single place in your application, you have the following code that rescues the Authorizer::InvalidAuthorization exception and shows an appropriate page:

begin

  handle_request

rescue Authorizer::InvalidAuthorization

  show_invalid_access_page

end

In this section, you learned about maintainability and usability considerations when handling errors with exceptions. In the following section, you'll learn that handling errors with exceptions has performance considerations as well.

Considering performance when using exceptions

One reason to prefer handling errors via return values instead of exceptions is that return values, in general, perform much better. For simple methods, there isn't a way to get the exception handling approach even close to the return value approach in terms of performance.

However, for methods that do even minimal processing, such as a single String#gsub call, the time for executing the method is probably larger than the difference between the exception approach and the return value approach. Still, for absolute maximum performance, you do need to use the return value approach.

One consideration when using exceptions is that they get slower in proportion to the size of the call stack. If you have a call stack with 100 frames, which is quite common in Ruby web applications, raising an exception is much slower than if you only have a call stack with 10 frames.

The reason for this is that when you raise an exception the normal way, Ruby has to do a lot of work to construct the backtrace for the exception. Ruby needs to read the entire call stack and turn it into an array of Thread::Backtrace::Location objects.

Constructing that array gets slower in proportion to the size of the call stack. In general, the time to construct the array of Thread::Backtrace::Location objects is much longer than executing the non-local return to the appropriate exception handler (the rescue clause that will handle the exception).

Is there a way in which you can speed up the exception generation process? Thankfully, yes, there is. Instead of raising the exception the way you would normally, as follows:

raise ArgumentError, "message"

You can include a third argument to raise, which is the array to use for the backtrace. If you want to make the exception handling as fast as possible, you can use an empty array:

raise ArgumentError, "message", []

Passing a new empty array still allocates an object for each raise. You can make this even faster if you use a shared frozen constant:

# Earlier, outside the method

EMPTY_ARRAY = [].freeze

# Later, inside a method

raise ArgumentError, "message", EMPTY_ARRAY

As shown in the preceding example, by using a frozen constant, you can skip the allocation of an array when raising the exception.
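To see how much the backtrace costs at different stack depths, you could run a rough benchmark such as the following. It is only a sketch: raise_at_depth and time_raises are hypothetical helper names, and the timings depend on your Ruby version and hardware, so no expected numbers are shown:

```ruby
require 'benchmark'

EMPTY_ARRAY = [].freeze

# Recurse until the requested number of extra stack frames exists,
# then raise with or without a backtrace.
def raise_at_depth(depth, backtraceless)
  return raise_at_depth(depth - 1, backtraceless) if depth > 0
  if backtraceless
    raise ArgumentError, "message", EMPTY_ARRAY
  else
    raise ArgumentError, "message"
  end
end

# Time n raise/rescue cycles at the given stack depth.
def time_raises(depth, backtraceless, n = 10_000)
  Benchmark.realtime do
    n.times do
      begin
        raise_at_depth(depth, backtraceless)
      rescue ArgumentError
        nil
      end
    end
  end
end

[10, 100].each do |depth|
  normal = time_raises(depth, false)
  fast = time_raises(depth, true)
  puts "depth #{depth}: normal #{normal.round(4)}s, " \
       "backtraceless #{fast.round(4)}s"
end
```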

Ruby allows you to construct an exception object manually, using an approach as per the following example:

exception = ArgumentError.new("message")

raise exception

If you are using the preceding approach, you can add a call to set_backtrace, so that raise will not try to generate the backtrace, as shown in the following code:

exception = ArgumentError.new("message")

exception.set_backtrace(EMPTY_ARRAY)

raise exception

However, this performance benefit has an associated cost. Because the exception being raised has no backtrace, it is much more difficult to debug if you run into problems. In general, if you want to use this approach, it is best to only use it for specific exception types. You should also make sure that you are specifically rescuing those exception types at some level above any methods you are calling that could raise the backtraceless exceptions.

Because backtraceless exceptions make debugging much more difficult, you should avoid using them by default in libraries. If you do want to support backtraceless exceptions in libraries for performance reasons, you should make the use of backtraceless exceptions only enabled via an option or setting. For example, if you have a module named LibraryModule and want to add support for backtraceless exceptions, you could add a skip_exception_backtraces accessor, as shown in this example:

exception = ArgumentError.new("message")

if LibraryModule.skip_exception_backtraces

  exception.set_backtrace(EMPTY_ARRAY)

end

raise exception
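One possible way to define that accessor is sketched below. Only the LibraryModule and skip_exception_backtraces names come from the example above; the fail_with helper is a hypothetical name added so the behavior can be shown end to end:

```ruby
module LibraryModule
  EMPTY_ARRAY = [].freeze

  class << self
    # Unset (nil) by default, so exceptions keep their backtraces
    # unless the application explicitly opts in.
    attr_accessor :skip_exception_backtraces
  end

  # Hypothetical helper that raises the same way as the example above.
  def self.fail_with(message)
    exception = ArgumentError.new(message)
    exception.set_backtrace(EMPTY_ARRAY) if skip_exception_backtraces
    raise exception
  end
end

# Default behavior: the backtrace is generated by raise.
default_empty = begin
  LibraryModule.fail_with("message")
rescue ArgumentError => e
  e.backtrace.empty?
end
default_empty # => false

# After opting in, the exception is raised without a backtrace.
LibraryModule.skip_exception_backtraces = true

opted_in_empty = begin
  LibraryModule.fail_with("message")
rescue ArgumentError => e
  e.backtrace.empty?
end
opted_in_empty # => true
```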

In this section, you learned about dealing with performance issues when raising exceptions. In the next section, you'll learn how to retry transient errors, using both the return value approach and exception approach.

Retrying transient errors

It's a fact of life, at least for a programmer, that some things fail all the time, but other things only fail occasionally. For those things that fail all the time, there is no point in retrying them. For example, if you call a method and it raises ArgumentError because you are calling it with the wrong number of arguments, as shown here:

nil.to_s(16)

You probably don't want to retry the preceding code, unless you expect that something will be redefining the NilClass#to_s method to accept an argument.

However, in many cases, especially those involving network requests, it is very common to encounter transient errors. In these cases, retrying errors makes sense. When making a network request, there may be multiple reasons why it may fail. Maybe the program at the other end of the request crashed and is being restarted. Maybe a construction crew accidentally cut a network cable between your computer and the computer you are connecting to, and failover to an alternative route hasn't happened yet. There are a vast number of possible reasons why transient errors could occur.

Thankfully, Ruby has a built-in keyword for handling transient errors, which is the retry keyword. Let's say you are writing a program that downloads data from a server using HTTP, given here:

require 'net/http'

require 'uri'

Net::HTTP.get_response(URI("http://example.local/file"))

The preceding program doesn't handle errors, so any exception raised when trying to download the file will result in an exception being reported and the program ending.

If one of the requirements for the program is that it absolutely must wait until the data is available, with no exceptions (pun intended), no matter how long it takes, and that any failed download must be retried as fast as possible, you could implement this with a rescue/retry combination, given here:

require 'net/http'

require 'uri'

begin

  Net::HTTP.get_response(URI("http://example.local/file"))

rescue

  retry

end

In general, the preceding approach is a bad idea, for multiple reasons. One reason is that it is a bad idea to retry on every exception type that could be raised. What happens if you make a typo in the protocol name, and it will not parse as a valid URI?

Well, then you end up with an infinite loop without it ever even attempting network access. You should almost always limit the errors you are retrying to specific exception classes. At least in this case, it might be useful to rescue errors related to sockets, system calls, and bad HTTP responses. It's even better to eliminate possible issues in URI creation, by moving the URI creation out of the loop. That also increases performance in the case where retry is needed, as given in the following code:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

begin

  Net::HTTP.get_response(uri)

rescue SocketError, SystemCallError, Net::HTTPBadResponse

  retry

end

When combined, the changes to set the uri variable before the begin clause and only rescue specific exception classes make the preceding code better. However, it still has issues.

One issue is that just because Net::HTTP.get_response(uri) returns a value and doesn't raise an exception, it does not mean the value isn't an error. The HTTP protocol supports both client errors (4xx errors) and server errors (5xx errors), and the returned response could be one of those errors.

You can check whether the response is an error response by checking whether the response code is greater than or equal to 400. It would be nice if you could use retry in that case, as in the following code:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

begin

  response = Net::HTTP.get_response(uri)

  if response.code.to_i >= 400

    # retry # would be nice

  end

rescue SocketError, SystemCallError, Net::HTTPBadResponse

  retry

end

Unfortunately, if you uncomment the first retry line, you'll see that the code raises SyntaxError. Since the retry keyword is only valid inside rescue clauses, it is not valid in the begin clause. That's a bummer.

One way around this issue is to raise one of the exceptions you are rescuing, and then have retry in the rescue clause handling the retry, as shown in the following code:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

begin

  response = Net::HTTP.get_response(uri)

  if response.code.to_i >= 400

    raise Net::HTTPBadResponse

  end

rescue SocketError, SystemCallError, Net::HTTPBadResponse

  retry

end

This does work, even if it seems like a code smell to use exceptions for flow control in this way.

What if your requirements change, and now you only want to retry on an HTTP client or server error, and not for other errors? In these cases, Net::HTTP does not raise an exception, so there is no reason to use a begin/rescue approach. One approach is a simple while loop, as shown in the following code:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

while response = Net::HTTP.get_response(uri)

  break unless response.code.to_i >= 400

end

This works fine and causes no problems, but determining the intent of the code is much harder. This looks like a loop that will continuously request the page, not an approach for retrying on error.

It turns out that Ruby has something that allows retrying outside rescue clauses. Unfortunately, it has its own limitation, and that is the fact that it is only usable inside blocks.

The redo keyword is one of the least used keywords in Ruby. If you haven't used it before, it is similar to the next keyword, but instead of going to the next block iteration, it restarts the current block iteration. Because it is only usable in blocks, it's a little hacky to use it for retrying on an error, but it does a better job of showing intent.

The trick is, you need a block that will be called exactly once. Thankfully, you already know one way to tell a block to execute a given number of times by using Integer#times. The following code shows you how you could use the redo keyword to retry on error:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

response = nil

1.times do

  response = Net::HTTP.get_response(uri)

  if response.code.to_i >= 400

    redo

  end

end

The advantage of the preceding code is that it conveys intent much better. You can see that by default, the block will only be called once, and it will only rerun the block if the response code indicates an error. Note that it's also possible to create a proc or lambda and just call it, but that generally performs worse as it requires allocating an object, unlike the approach of passing a block to Integer#times.

In general, procs and lambdas (Proc instances) are among the more expensive object instances to create, at least compared to other core classes.
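If you want to experiment with redo without making network requests, the following sketch replaces the HTTP call with a hypothetical array of canned status codes that reports an error twice and then succeeds:

```ruby
# Hypothetical stand-in for the responses of Net::HTTP.get_response:
# two error statuses, followed by a success.
codes = ["503", "500", "200"]
attempts = 0

code = nil
1.times do
  code = codes[attempts]
  attempts += 1
  # Restart this same iteration while the status indicates an error.
  redo if code.to_i >= 400
end

code     # => "200"
attempts # => 3
```

The block is still only iterated once by Integer#times; the two extra executions come from redo restarting that single iteration.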

Understanding more advanced retrying

In general, retrying an infinite number of times is a bad idea. If that is one of the requirements you are given, you may want to push back and see whether you can determine a reasonable limit. For network operations, retrying 2 to 5 times is not uncommon. Even retrying 100 times is probably better than always retrying.

It's fairly easy to retry a given number of times in Ruby. If you are using the exception approach to retrying, you can add a local variable for the number of retries, increment it with each exception, and use raise instead of retry if the local variable is over a specified number. If you wanted to retry a maximum of three times, the code would look like this:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

retries = 0

begin

  Net::HTTP.get_response(uri)

rescue SocketError, SystemCallError, Net::HTTPBadResponse

  retries += 1

  raise if retries > 3

  retry

end

Similarly, if you are using the loop for retrying without exceptions, or the 1.times block with redo, you should switch to using Integer#times for the number of retries you want to allow, plus one for the initial attempt. The following code demonstrates this:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

response = nil

4.times do

  response = Net::HTTP.get_response(uri)

  break if response.code.to_i < 400

end

Both of the preceding approaches are unfortunately too simple for most production usage. In general, retrying immediately is unlikely to get useful results in real-world situations.

You are likely to get better results if you wait between each retry attempt. How long you should wait depends on the situation, but in many network situations, waiting a few seconds is considered reasonable. If you want to wait a fixed amount of time between retries, you can add a sleep call before the retry. For example, the following code shows the case when we want to wait 3 seconds between retries:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

retries = 0

begin

  Net::HTTP.get_response(uri)

rescue SocketError, SystemCallError, Net::HTTPBadResponse

  retries += 1

  raise if retries > 3

  sleep(3)

  retry

end

This approach in general is still too simple. In most real-world situations, you increase the amount of time between each retry. This provides a happy medium between too short of a retry time and too long of a retry time.

You send the first retry quickly, just in case there is a simple reason for the transient failure. However, after every retry, it looks less and less likely that the request will succeed if retried, so you wait longer between each retry. One approach to doing this is to start at 3 seconds, but double the amount of time in each retry. You can calculate this by multiplying the number of seconds to initially wait by 2 to the power of the number of retries already performed. The following code demonstrates this:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

retries = 0

begin

  Net::HTTP.get_response(uri)

rescue SocketError, SystemCallError, Net::HTTPBadResponse

  retries += 1

  raise if retries > 3

  sleep(3 * 2**(retries-1))

  retry

end

This approach is decent, but it can result in the times to sleep growing quickly. For only 3 retries, it's probably fine, since you sleep for 3, 6, and 12 seconds, which means you are retrying 3 seconds, 9 seconds, and 21 seconds after the initial failure (not counting the time spent on the requests themselves). However, if you are retrying 10 times, you will be waiting for close to an hour before all retries fail.
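The arithmetic behind those numbers can be checked with a short calculation. The doubling_delays method is a hypothetical helper that lists the sleep before each retry using the same formula as the preceding code:

```ruby
# Seconds slept before each retry, using 3 * 2**(retries-1).
def doubling_delays(max_retries, initial = 3)
  (1..max_retries).map { |retries| initial * 2**(retries - 1) }
end

delays = doubling_delays(3)
delays # => [3, 6, 12]

# Seconds elapsed since the first failure at each retry,
# ignoring the time taken by the requests themselves.
(1..3).map { |n| delays.first(n).sum } # => [3, 9, 21]

# Total sleep for 10 retries: 3069 seconds, or about 51 minutes.
doubling_delays(10).sum # => 3069
```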

For a larger number of retries, you may want to decrease the exponentiation base. It's also a good idea to add some amount of randomness to the process if you have multiple processes using the same algorithm, to prevent a related problem called the thundering herd, where a large number of processes are retrying at exactly the same time and overwhelming the server. The following code is a modified implementation of a classic exponential backoff algorithm:

require 'net/http'

require 'uri'

uri = URI("http://example.local/file")

retries = 0

begin

  Net::HTTP.get_response(uri)

rescue SocketError, SystemCallError, Net::HTTPBadResponse

  retries += 1

  raise if retries > 3

  sleep(3 * (0.5 + rand/2) * 1.5**(retries-1))

  retry

end

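With this approach, even with 10 retries, all retries will complete within 6 minutes, since the sleep times add up to at most about 340 seconds (not counting the time spent on the requests themselves).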
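Because of the random factor, the total sleep time is a range and not a single number. The following sketch computes the bounds of that range; backoff_bounds is a hypothetical helper, and it relies on rand returning a float that is at least 0 and less than 1:

```ruby
# The random factor (0.5 + rand/2) is at least 0.5 and below 1.0,
# so each sleep is between half of and all of 3 * 1.5**(retries-1).
# Returns [lower bound, upper bound] of the total sleep in seconds.
def backoff_bounds(max_retries, initial = 3, base = 1.5)
  upper = (1..max_retries).sum do |retries|
    initial * base**(retries - 1)
  end
  [upper / 2, upper]
end

backoff_bounds(3) # => [7.125, 14.25]

# For 10 retries, the total sleep is between roughly 170 seconds
# and roughly 340 seconds.
lower, upper = backoff_bounds(10)
```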

In this section, you learned about advanced approaches to retrying in the face of transient errors. In the next section, you'll learn about how to avoid trying code that has recently raised transient errors, using an approach called a circuit breaker.

Breaking circuits

A problem related to retrying is how to handle code that you want to run, but that isn't critical to the success of the program.

For example, if you are running a payment processing service, the actual payment processing is critical to the success of your business, so you want to do everything you can to make that work. However, your application may be calling an external service to get recommendations for the user making a request, and an external service to get advertisements to display on the page, and you would not want a failure of either service to affect the processing of payments.

Let's say you have code that looks like this:

begin

  @recommendations = recommender_service.call(timeout: 3)

rescue

end

@ads = ad_service.call(timeout: 3) rescue nil

process_payment

In general, it's not a good idea to use rescue nil, but if you really don't care why a service failed, it can be okay to use. In this example, if either recommender_service or ad_service is temporarily down, payment processing can take up to 3 additional seconds for each failing service. That can significantly affect how many payments you can process per hour, which can put a large dent in your bottom line.

In cases like this, you probably do not want to call either recommender_service or ad_service if they have been failing recently. For example, if you get three failing requests within a minute, you may want to not try the service until a minute after the first failing request.

You can build a simple class to handle this, called BrokenCircuit. The pattern this class implements is called a circuit breaker due to its similarity to physical circuit breakers in electrical engineering. You can start by having the constructor take a number of failures, and the number of seconds to wait. It will also use an array to store the failure times, as shown in the following code:

class BrokenCircuit

  def initialize(num_failures: 3, within: 60)

    @num_failures = num_failures

    @within = within

    @failures = []

  end

You can code the circuit breaker implementation by seeing whether the current number of failures has reached the number of failures allowed. If it has, then you get a cutoff time to remove older failures by subtracting the time to wait from the current time, and then remove any times from the failures array that are before the cutoff time.

Finally, you recheck whether the number of recent failures has still reached the number allowed, and if so, you return without yielding to the block. If the number of recent failures is less than the number allowed, you yield to the block and rescue any exceptions. If there is an exception, you store the time of failure in the failures array, and return nil, as shown in this code here:

  def check

    if @failures.length >= @num_failures

      cutoff = Time.now - @within

      @failures.reject!{|t| t < cutoff}

      return if @failures.length >= @num_failures

    end

    begin

      yield

    rescue

      @failures << Time.now

      nil

    end

  end

end

Then you can set up your circuit breakers in your application. These are generally singleton objects, usually implemented as constants:

RECOMMENDER_CIRCUIT = BrokenCircuit.new

AD_CIRCUIT = BrokenCircuit.new

Then you can use the circuit breakers in your code prior to payment processing:

@recommendations = RECOMMENDER_CIRCUIT.check do

  recommender_service.call(timeout: 3)

end

@ads = AD_CIRCUIT.check do

  ad_service.call(timeout: 3)

end

process_payment
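To see when the circuit opens and closes, it can help to run the same logic against a clock you control. The following is a sketch under stated assumptions: DemoCircuit is a hypothetical copy of BrokenCircuit whose only change is an injectable clock argument, and the failing lambda stands in for a service that is down:

```ruby
# Hypothetical variant of BrokenCircuit with an injectable clock,
# so the behavior can be observed without actually waiting.
class DemoCircuit
  def initialize(num_failures: 3, within: 60, clock: -> { Time.now })
    @num_failures = num_failures
    @within = within
    @clock = clock
    @failures = []
  end

  def check
    if @failures.length >= @num_failures
      cutoff = @clock.call - @within
      @failures.reject! { |t| t < cutoff }
      return if @failures.length >= @num_failures
    end

    begin
      yield
    rescue
      @failures << @clock.call
      nil
    end
  end
end

now = Time.at(0)
circuit = DemoCircuit.new(clock: -> { now })

calls = 0
failing = -> { calls += 1; raise "service down" }

# The first three checks call the block and record a failure.
# The fourth and fifth checks return nil without calling the block.
5.times { circuit.check(&failing) }
calls_while_open = calls

# Once the recorded failures are more than 60 seconds old,
# the block is called again.
now += 61
result = circuit.check { :ok }
```

After this runs, calls_while_open is 3 and result is :ok, which matches the behavior described for BrokenCircuit: the service is skipped while there are three recent failures, and tried again once they age out.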

Generally speaking, production circuit breaker design is more complex and involved than all of the preceding examples, and you should probably use one of the many circuit breaker gems for Ruby instead of trying to implement a circuit breaker in your own code.

In this section, you learned all about retrying transient errors, including the basics of implementing circuit breakers. In the next section, you'll learn about how to design useful exception class hierarchies.

Designing exception class hierarchies

In general, if you are writing a library and raising an exception in it, it is useful to have a custom exception subclass that you use. Let's say you are passing an object to your method, and the object has to be allowed, or an exception should be raised. Ruby allows you to do this by using the following code:

def foo(bar)

  unless allowed?(bar)

    raise "bad bar: #{bar.inspect}"

  end

end

However, this is a bad approach, as it raises RuntimeError. In general, it is better to raise an exception class related to your library, since that allows users of your library to handle the exception differently from exceptions in other libraries. So if you have a library named Foo, it's common to have an exception class named something like Foo::Error that you can use for exceptions raised by the library. The following code demonstrates this:

module Foo

  class Error < StandardError

  end

  def foo(bar)

    unless allowed?(bar)

      raise Error, "bad bar: #{bar.inspect}"

    end

  end

end

It's important that Foo::Error is a subclass of StandardError and not a direct subclass of Exception. You should only subclass Exception directly in very rare cases, because exception classes that do not descend from StandardError are not caught by rescue clauses without arguments. Using rescue with no exception classes given only rescues descendants of the StandardError class.
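The difference is easy to check directly. In the following sketch, the two exception classes and the caught_by_bare_rescue? method are hypothetical names used only for this demonstration:

```ruby
class StandardChild < StandardError; end
class ExceptionChild < Exception; end

# Returns true if a rescue clause without arguments catches the
# given exception class, and false if the exception escapes it.
def caught_by_bare_rescue?(exception_class)
  begin
    raise exception_class
  rescue
    true
  end
rescue Exception
  # Only reached when the inner rescue did not catch the exception
  false
end

puts caught_by_bare_rescue?(StandardChild)   # true
puts caught_by_bare_rescue?(ExceptionChild)  # false
```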

In general, it is best to keep your exception class hierarchy as simple as possible. If your code never explicitly raises an exception, do not create an exception class. When your code first needs to raise an exception, create a general Error class, such as Foo::Error. Thereafter, in future cases when raising an exception, use the same general Error class.

When should you have multiple exception classes in your library? In general, the only reason to use a separate exception class is for a type of error that users are likely to want to handle differently from other types of errors. For example, let's say in your library that there are two types of errors that can occur, permanent errors and transient errors.

In case of a transient error, it's possible that the same request will succeed in the future. However, if it is a permanent error, this means the same request will always fail in the future.

In this case, it makes sense to create a Foo::TransientError class:

module Foo

  class Error < StandardError

  end

  class TransientError < Error

  end

end

This way, users calling your library can rescue that particular exception class, and only retry in that case:

begin

  foo(bar)

rescue Foo::TransientError

  sleep(3)

  retry

end
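As a self-contained illustration of how the separate class helps, the following sketch combines it with the retry limit used earlier in the chapter. The flaky lambda is a hypothetical stand-in for a library call that fails twice with a transient error before succeeding, and the sleep is kept very short only so the example runs quickly:

```ruby
module Foo
  class Error < StandardError; end
  class TransientError < Error; end
end

# Hypothetical stand-in for a library call: fails twice, then succeeds
attempts = 0
flaky = lambda do
  attempts += 1
  raise Foo::TransientError, "try again" if attempts < 3
  :done
end

retries = 0
result = begin
  flaky.call
rescue Foo::TransientError
  retries += 1
  raise if retries > 3
  sleep(0.01) # a real caller would wait longer, or back off
  retry
end
```

Here result ends up as :done after two retries. Any other Foo::Error raised by the call would not match the rescue clause, so it would propagate to the caller without being retried.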

How will you know which exceptions deserve separate exception classes and which exceptions do not? In many cases, you won't know. Unless you have a very clear idea that a particular exception should be treated differently, just use the generic exception class for your library when raising the exception.

Later, you may get reports from users that they want to treat a particular error case differently. The following code shows what users will often be doing in this case:

begin

  foo(bar)

rescue Foo::Error => e

  if e.message =~ /\Abad bar: /

    handle_bad_bar(bar)

  else

    raise

  end

end

When you get a report that a user would like a new exception class created, then you can reanalyze the situation. At that point, you may want to create a subclass of the library's generic exception class for that particular error, and change the place where that exception is raised to use the new exception class, as shown in the following code:

module Foo

  class Error < StandardError

  end

  class TransientError < Error

  end

  class BarError < Error

  end

  def foo(bar)

    unless allowed?(bar)

      raise BarError, "bad bar: #{bar.inspect}"

    end

  end

end

The advantage of using the preceding approach for adding exception classes is that it is backward-compatible. The previous example, which rescues Foo::Error and checks e.message, still works. In the future, the user can switch to rescuing Foo::BarError, similar to this example:

begin

  foo(bar)

rescue Foo::BarError

  handle_bad_bar(bar)

end
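The backward compatibility can be checked with a short sketch. It assumes a simplified Foo.foo, written as a module method with a hypothetical check (bar == :ok) in place of allowed?, so that the example is self-contained:

```ruby
module Foo
  class Error < StandardError; end
  class BarError < Error; end

  # Simplified stand-in for the library method in the text
  def self.foo(bar)
    raise BarError, "bad bar: #{bar.inspect}" unless bar == :ok
    bar
  end
end

# Older caller code: rescues the generic class and checks the message
old_style = begin
  Foo.foo(:nope)
rescue Foo::Error => e
  if e.message =~ /\Abad bar: /
    :handled_by_message
  else
    raise
  end
end

# Newer caller code: rescues the specific class
new_style = begin
  Foo.foo(:nope)
rescue Foo::BarError
  :handled_by_class
end
```

Both callers handle the error, because Foo::BarError is a subclass of Foo::Error and the exception message did not change.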

The principle when designing exception class hierarchies is similar to the principle of designing class hierarchies in general: avoid exception class proliferation, and create only the exception classes necessary for users to appropriately handle exceptions raised by your library.

Using core exception classes

Note that in some cases, it may be permissible to use one of the built-in exception classes. For example, if you only want to accept a certain type of argument, you could raise TypeError if the passed argument is of the wrong type:

def baz(int)

  unless int.is_a?(Integer)

    raise(TypeError,

          "int should be an Integer, is #{int.class}")

  end

  int + 10

end

While this is an appropriate use of the TypeError exception class, it results in unidiomatic Ruby code. In general, idiomatic Ruby code avoids defensive programming based on types, because in Ruby, what matters is what methods the object responds to and the objects returned by those methods.

In Ruby, it shouldn't matter what actual class the object uses. Except in special cases, it's best to avoid this type of programming, and just use the object without explicitly checking its type. In this example, we pass the object directly as an argument to Integer#+:

def baz(int)

  10 + int

end

If Ruby needs to deal with the object internally, where the underlying type actually matters, Integer#+ will raise TypeError if int cannot be coerced into an integer. You don't generally need to do such TypeError checks, because Ruby does it for you.
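A quick sketch shows the behavior of the simpler version of baz. The variable names outside the method are only for illustration:

```ruby
def baz(int)
  10 + int
end

from_integer = baz(5)    # 15
from_float   = baz(2.5)  # 12.5, since other numeric types also work

# A string cannot be coerced, so Integer#+ raises TypeError itself
error = begin
  baz("5")
rescue TypeError => e
  e
end
```

Without any explicit type check, baz accepts any object that Integer#+ can work with, and still raises TypeError for objects it cannot.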

Summary

In this chapter, you've learned how best to handle errors in your Ruby code. You've learned about handling errors using return values, handling errors with exceptions, and the trade-offs between the two approaches.

You've learned how to retry in the case of transient errors when using both approaches, and you've also learned about more advanced techniques, such as exponential backoff and circuit breakers. You've also learned how to properly design exception class hierarchies. Proper error handling is one of the more important aspects of programming, and now you are better prepared to handle errors properly in your application.

In the next chapter, you'll shift gears a little and learn how code formatting can affect maintenance.

Questions

  1. What is the main advantage of using return values to signal errors?
  2. What is the main advantage of using exceptions to signal errors?
  3. Why is it important not to retry transient errors immediately?
  4. When is a good time to add a subclass of an existing exception class?