Differences Between Rate Limiting & Throttling of Web Api

- July 14, 2022

When dealing with cloud APIs, there will typically be the concept of Throttling when API consumption moves beyond a specific limit. The particular model of throttling (leaky bucket, burstable, etc) is outside the scope of this post. For the purposes of this post, we will assume a simple, numeric rate limit (ie: x Api calls per second).

The key concept is that the API is expensive either time, or compute and so there is a need to restrict the rate that the calls are made. Most developer code assumes that all APIs are free. Tight loops are common to operate code as fast as possible. Cloud APIs that distribute compute are not free and need special handling.

Rate Limiting is a client side response to the maximum capacity of a channel. If a channel has capacity to consume requests at a given rate/sec then a client should be prepared to limit their request rate. A common response to avoid implementing rate limiting on the client is that the server should allow processing at the appropriate rate. However, n the case where the API triggers asynchronous operations the response rate to the API may be fast, but the operation in the background is much slower.

Throttling is a server side response where feedback is provided to the caller indicating that there are too many requests coming in from that client or that the server is overloaded and needs clients to slow down their rate of requests. When a throttling event happens, the general client side response use exponential backoff to ensure that the system can recover even with multiple making requests at the same time.

With an assumed maximum rate of 10 requests/second and an API limit of 4 requests/second. We can immediately see that calling the API 1000 times we can sustain 100 seconds of requests. However the API itself would take 250 to complete the scan. This of course assumes that our client is the only caller inter that API, in a lot of cases AWS internally is making requests, you may have a secondary orchestrator making requests and finally you may have an infosec scanner making requests. So the in reality our client may only be able to get 1 or 2 requests/second.

Search This Blog

You Tube Channel: www.youtube.com/@bramanareddy0482

Differences Between Rate Limiting & Throttling of Web Api

Comments

Post a Comment

Popular posts from this blog

Email Sending through O365 using OAuth Protocol

Why should I migrate my .NET Framework applications to .NET Core/.NET 6?

What is private bytes, virtual bytes, working set comes under App pool worker process?