Implementing Exponential Back-Off

Exponential back-off is the practice of a client retrying a failed request with progressively longer waits between attempts. It is a standard error-handling strategy for network applications.

Besides being “required”, using exponential back-off increases the efficiency of bandwidth usage, reduces the number of requests needed to get a successful response, and maximizes the throughput of requests in concurrent environments.

The flow for a simple exponential back-off is as follows:

  1. Make a request to the API

  2. Receive the response and check whether it is an error with a retry-able error code (such as 503)

  3. Wait 1s + random_number_milliseconds

  4. Retry the request

  5. Receive the response and check whether it is an error with a retry-able error code (such as 503)

  6. Wait 2s + random_number_milliseconds

  7. Retry the request

  8. Receive the response and check whether it is an error with a retry-able error code (such as 503)

  9. Wait 4s + random_number_milliseconds

  10. Retry the request

  11. Receive the response and check whether it is an error with a retry-able error code (such as 503)

  12. Wait 8s + random_number_milliseconds

  13. Retry the request

  14. Receive the response and check whether it is an error with a retry-able error code (such as 503)

  15. Wait 16s + random_number_milliseconds

  16. Retry the request

  17. If you still get an error, stop and log the error

Note: random_number_milliseconds MUST be regenerated for each wait.

In the above flow, random_number_milliseconds is a random number of milliseconds less than or equal to 1000. Adding this random component is necessary to avoid certain lock errors in some concurrent implementations.

Note: the wait is always (2^n) seconds + random_number_milliseconds, where n is a monotonically increasing integer initially defined as 0. The value of n is incremented by 1 for each iteration (each request).
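The wait formula can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name backoff_wait_seconds and the optional rng parameter are assumptions made for the example, and the jitter is assumed to be a whole number of milliseconds between 0 and 1000.

```python
import random

def backoff_wait_seconds(n, rng=random):
    """Return the wait before retry n (n = 0, 1, 2, ...), in seconds.

    Hypothetical helper: 2^n seconds plus a freshly drawn jitter.
    """
    # The jitter is drawn anew on every call: an integer in [0, 1000] ms.
    random_number_milliseconds = rng.randint(0, 1000)
    return (2 ** n) + random_number_milliseconds / 1000.0

# Each wait falls between 2^n and 2^n + 1 seconds.
for n in range(5):
    wait = backoff_wait_seconds(n)
    assert 2 ** n <= wait <= 2 ** n + 1
```

Passing a seeded random.Random instance as rng makes the jitter reproducible, which may be convenient in tests.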

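The algorithm is set to terminate when n == 5. This ceiling prevents clients from retrying indefinitely and results in a total delay of roughly 31 to 36 seconds (1 + 2 + 4 + 8 + 16 = 31 seconds of fixed waits, plus up to 1 second of random delay on each of the five waits) before the error is deemed “unrecoverable.”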
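The complete flow, including the ceiling at n == 5, might look like the following Python sketch. It rests on several assumptions that are not part of this page: make_request is a hypothetical callable returning a (status_code, body) pair, only 503 is treated as retry-able, and the sleep and rng parameters exist solely so the loop can be exercised without real delays.

```python
import logging
import random
import time

RETRYABLE_STATUS_CODES = {503}  # assumption: extend to match the API in use
MAX_RETRIES = 5                 # stop retrying once n == 5

def call_with_backoff(make_request, sleep=time.sleep, rng=random):
    """Call make_request(), retrying retry-able errors with back-off.

    Returns the last (status_code, body) received, which is still an
    error if every retry failed.
    """
    status, body = make_request()
    n = 0
    while status in RETRYABLE_STATUS_CODES and n < MAX_RETRIES:
        jitter_ms = rng.randint(0, 1000)      # regenerated for every wait
        sleep((2 ** n) + jitter_ms / 1000.0)  # 1s, 2s, 4s, 8s, 16s + jitter
        n += 1
        status, body = make_request()
    if status in RETRYABLE_STATUS_CODES:
        # Final step of the flow: give up and log the error.
        logging.error("Giving up after %d retries; last status %s", n, status)
    return status, body

# Usage with a fake request that fails twice and then succeeds.
# The waits are recorded in a list instead of actually sleeping.
responses = iter([(503, ""), (503, ""), (200, "ok")])
waits = []
result = call_with_backoff(lambda: next(responses), sleep=waits.append)
```

In this usage example the third request succeeds, so only the first two waits (about 1 and 2 seconds) are taken. A request that never succeeds is attempted six times in total: the original request plus five retries.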
