Rate limiting caps how many requests a user or system can make to an application or application programming interface (API) within a given period. Although requests are addressed to a particular server (such as a business’s website), they are processed at the application level on that server. Administrators can define rate limits tailored to their own website in the server’s configuration file. Rate limits are set to mitigate distributed denial-of-service (DDoS) attacks and to keep an application from being overwhelmed by excessive traffic.
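As an illustration of a per-user limit, the following is a minimal sketch of a fixed-window counter, one common way to implement such a cap. The class name `FixedWindowLimiter`, its parameters, and the injectable `clock` are hypothetical names chosen for this example and do not come from any particular server or library.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each `window_seconds` window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be exercised without sleeping
        # client id -> [window start time, requests counted in that window]
        self._state = defaultdict(lambda: [None, 0])

    def allow(self, client_id):
        now = self.clock()
        state = self._state[client_id]
        # Start a fresh window on first use or once the current one has expired.
        if state[0] is None or now - state[0] >= self.window_seconds:
            state[0] = now
            state[1] = 0
        if state[1] < self.limit:
            state[1] += 1
            return True
        # Over the limit: a web server would typically answer with HTTP 429.
        return False
```

A caller would create one limiter, for example `FixedWindowLimiter(limit=100, window_seconds=60)`, and call `allow(client_id)` before handling each request. Real servers usually express the same idea as settings in their configuration file rather than as application code.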
A DDoS attack floods a server with requests from many Internet sessions at once, typically all controlled by a single attacker. This can force the server to shut down temporarily, and the attack is very difficult to halt once it has been launched. Rate limiting guards against this by restricting how many users can access the server or how much traffic the server will accept at one time. It also conserves server resources so that the server runs more efficiently. The same applies to APIs, which define how applications interact with one another: rate limiting makes them more secure and more efficient, and keeps them from being overwhelmed.
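To make the idea of capping total traffic concrete, here is a minimal sketch of a token bucket, a widely used rate-limiting technique that the text above does not name. The class `TokenBucket` and its parameters `rate`, `capacity`, and `clock` are assumptions made for this example.

```python
import time


class TokenBucket:
    """Admit requests at an average of `rate` per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens (burst size)
        self.clock = clock        # injectable time source for testing
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # bucket is empty, so the request is rejected
```

A single shared bucket limits the traffic the server as a whole will accept, while one bucket per client would limit each user separately. Note that rate limiting on one server reduces the impact of a flood but may not stop a large distributed attack on its own.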
In data centers and cloud platforms, rate limiting manages traffic according to the current volume of activity. Ideally, in such environments, automated processes set and adjust rate limits without requiring human administration.
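One way such automatic adjustment could work is sketched below, assuming the platform can report a load figure between 0 and 1 (for example, CPU utilization). The function `adjust_limit`, its thresholds, and its step sizes are hypothetical values chosen for illustration, not settings from any specific cloud platform.

```python
def adjust_limit(current_limit, load, target_load=0.7, min_limit=10, max_limit=10_000):
    """Return a new requests-per-second limit given observed load in [0, 1]."""
    if load > target_load:
        # Overloaded: halve the limit to shed traffic quickly.
        new_limit = current_limit * 0.5
    else:
        # Headroom available: raise the limit by a small fixed step.
        new_limit = current_limit + 10
    # Keep the result within the configured bounds.
    return int(max(min_limit, min(max_limit, new_limit)))
```

Called periodically, for instance `limit = adjust_limit(limit, measured_load)`, this lowers the limit sharply under heavy load and raises it gradually when the system has spare capacity.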