Rate Limiting

Protecting the system from a sudden, unexpected surge of requests

Harshit Sharma
2 min read · Apr 10, 2021

The Problem

In a typical scenario, you scale your system to handle the number of requests you have already predicted. For a system capable of handling 100 requests at a time, the expected traffic is no more than 80 concurrent requests, and a buffer of 20 requests is kept in order to absorb a sudden spike.

But what if a spike of 30 requests comes in? Will the system be able to handle that many requests?

The answer is NO. The scenario above results in a completely overwhelmed system that also drops the requests it was already processing.

What happened?

Let's say the expected maximum number of concurrent requests was 80. Keeping sudden spikes in mind, the system was designed to handle 100 requests at a time.

But in an unexpected scenario, 80 requests arrived and were being processed, and just after that, a spike of 30 more requests came in. Here, the system maxed out its compute resources at 100 requests. The 10 extra requests caused a complete blackout, dropping the requests that were already being processed too.

The Solution

The problem above was caused by the 10 requests the system was not designed to handle. And because of these extra requests, all the other requests suffered.

So, in order to prevent such a scenario, we can implement a simple algorithm: back off these extra requests and ask them to come back later, since we have no capacity for them right now.

For this, let's maintain a request pool that allows only a certain number of requests to be processed at a time and tells all the extra requests to try again after some time.

For example, our system is expected to handle 4 requests at a time, so we initiate a request pool of 4 requests on a system that is capable of handling 5 requests. The buffer of 1 request supports the extra bookkeeping that we will be performing.
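In Go, such a request pool can be sketched as a buffered channel used as a counting semaphore. This is a minimal illustration of the idea above, not the article's original embedded code; the names `pool`, `tryAcquire`, and `release` are my own, and the capacity of 4 matches the example:

```go
package main

import "fmt"

// pool is a counting semaphore: a buffered channel whose capacity
// is the number of requests allowed to be processed at once.
var pool = make(chan struct{}, 4)

// tryAcquire reports whether a slot was free. The non-blocking send
// succeeds only while the channel has spare capacity.
func tryAcquire() bool {
	select {
	case pool <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot when a request finishes processing.
func release() { <-pool }

func main() {
	// Simulate 6 requests arriving while none finish: the first 4
	// occupy the pool, and the remaining 2 are backed off.
	for i := 1; i <= 6; i++ {
		if tryAcquire() {
			fmt.Printf("request %d: accepted\n", i)
		} else {
			fmt.Printf("request %d: rejected, try again later\n", i)
		}
	}
}
```

Once a request calls `release`, its slot immediately becomes available to the next caller of `tryAcquire`, which is exactly the "try again after some time" behavior we want.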

Implementing in Go
