When a website returns a 503 Service Unavailable status to a client, it is typically informing the browser that it is overloaded.
This is an example of throttling in action. The throttling pattern quickly responds to increased load by restricting the consumption of resources by an application instance, a tenant, or an entire service so that the system being consumed can continue to function and meet service level agreements. The example shows a scenario in which paying customers get priority when the system is under heavy load.
This pattern allows resource consumption up to some soft limit (that is, below the hard, maximum capacity of the system) that, when reached, causes the system to begin throttling the requests.
The system can throttle by rejecting requests outright, degrading functionality (such as switching to a lower bit rate video stream), focusing on high-priority requests (such as processing messages only from paid subscribers and not from trial users), or deferring requests so that clients retry later (as in the HTTP 503 case).
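As a rough illustration of the soft-limit behavior described above, the following Python sketch admits every request below a soft limit, admits only paid subscribers between the soft and hard limits, and defers everything else with a 503 and a retry hint. The `Throttle` class, its parameters, and the use of in-flight request count as the load measure are assumptions made for this example, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    status: int            # HTTP status code to send back
    retry_after: int = 0   # seconds to wait; only meaningful for 503


class Throttle:
    """Hypothetical admission gate based on the number of in-flight requests."""

    def __init__(self, soft_limit: int, hard_limit: int, retry_after: int = 5):
        if not 0 < soft_limit <= hard_limit:
            raise ValueError("need 0 < soft_limit <= hard_limit")
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.retry_after = retry_after
        self.in_flight = 0

    def try_acquire(self, is_paid: bool) -> Decision:
        # At the hard limit nobody is admitted, paid or not.
        if self.in_flight >= self.hard_limit:
            return Decision(503, self.retry_after)
        # Between the soft and hard limits only paid subscribers get through.
        if self.in_flight >= self.soft_limit and not is_paid:
            return Decision(503, self.retry_after)
        self.in_flight += 1
        return Decision(200)

    def release(self) -> None:
        # Call once for every admitted request when it finishes.
        self.in_flight = max(0, self.in_flight - 1)
```

A request handler would call `try_acquire` before doing any work, send the 503 (with a `Retry-After` header set from `retry_after`) when the request is refused, and call `release` when an admitted request completes. This sketch is not thread-safe; a real service would need a lock or an atomic counter around `in_flight`.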
Throttling is often paired with auto-scaling: because scaling up is not instantaneous, throttling helps keep the system operational until the new resources come online, after which the soft limit can be raised.
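One simple way to raise the soft limit as capacity arrives is to derive it from the number of instances currently online. The sketch below assumes a fixed per-instance capacity and a headroom percentage; both the function name and the 80 percent default are illustrative assumptions, not values taken from any platform.

```python
def soft_limit_for(online_instances: int,
                   capacity_per_instance: int,
                   headroom_percent: int = 80) -> int:
    """Hypothetical helper: soft limit as a share of total hard capacity."""
    if online_instances < 0 or capacity_per_instance < 0:
        raise ValueError("instance count and capacity must be non-negative")
    if not 0 < headroom_percent <= 100:
        raise ValueError("headroom_percent must be in (0, 100]")
    hard_capacity = online_instances * capacity_per_instance
    # Integer arithmetic keeps the result exact and below the hard capacity.
    return hard_capacity * headroom_percent // 100
```

Recomputing this value whenever the auto-scaler reports that an instance has come online (or gone away) keeps the soft limit in step with the capacity that is actually available.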
If your web application consumes an external service (such as the SQL Database or Storage service), your application code must be aware of how the service may throttle your requests and handle the throttling exceptions properly (perhaps by retrying the operation later).
Handling throttling this way makes your website more resilient, because it does not give up as soon as it encounters a throttling exception.
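On the consuming side, "retrying the operation later" is commonly done with a bounded number of attempts and an increasing delay between them. The following Python sketch shows one way to do this. `ThrottledError` is a hypothetical stand-in for whatever throttling exception the real client library raises, and the attempt count and delays are arbitrary illustrative defaults.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a service's throttling exception."""

    def __init__(self, retry_after=None):
        super().__init__("request was throttled")
        self.retry_after = retry_after  # seconds suggested by the service, if any


def call_with_retry(operation, max_attempts=5, base_delay=0.5,
                    max_delay=30.0, sleep=time.sleep):
    """Call operation(), retrying with backoff when it is throttled."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            if err.retry_after is not None:
                # Prefer the wait time the service asked for.
                delay = err.retry_after
            else:
                # Exponential backoff with jitter so clients do not retry in lockstep.
                delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
```

The `sleep` parameter exists only so the wait can be replaced in tests. In production code, the exception type and the way a suggested wait time is exposed depend on the specific service client being used.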
If your web application is a service itself, implementing the Throttling pattern as a part of your service logic makes your website more scalable in the face of rapid increases in load.