How to Add Rate Limiting to Your API
Learn how to protect your API from overload and abuse — the simple way.
What Is Rate Limiting?
Rate limiting is a set of rules that controls how many requests a user or app can make to your API in a given period of time. Think of it like a bouncer at a club — it lets a certain number of people in per minute, and anyone who tries to push past that limit gets turned away with a friendly error message instead of crashing the whole party.
When you build an API, anyone with the link can call it — and that includes bad actors who might hammer your server with thousands of requests per second, deliberately or accidentally. Rate limiting puts a speed bump on that behavior. It keeps your server healthy, your costs down, and your honest users happy.
Why You Need It from Day One
APIs are shared resources. If one user hammers your API with 10,000 requests in a single minute, the server has less power left to answer everyone else. In the worst case, your site goes down for everyone. In the best case, things get slow and frustrating.
Rate limiting also protects you financially. Many API hosting services charge based on how much computing power you use — and runaway requests can rack up a bill fast. A few simple limits can keep your costs predictable.
💡 Key Insight
Rate limiting isn't about blocking users. It's about making sure every user gets a fair share of your server. Without limits, one loud user can drown out everyone else.
Simple Steps: How Rate Limiting Works
Here's the basic idea in plain steps:
1. Track the requester
Every request to your API comes from somewhere: a user's IP address, their API key, or their account ID. Your rate limiter keeps a separate counter for each one.
2. Count requests in a time window
The limiter starts a timer (say, one minute). Each time that user makes a request, their counter goes up by one. When the timer runs out, the counter resets to zero and a new window begins.
3. Let them in or block them
If the count is still within the limit, the request goes through normally. If it exceeds the limit, the API returns an HTTP 429 "Too Many Requests" error and the user has to wait.
4. Tell them when they can try again
The 429 response usually includes a Retry-After header that tells the client how long to wait, typically as a number of seconds, before trying again. Well-behaved API clients honor it automatically.
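To make the four steps concrete, here is a minimal sketch of a fixed-window counter in plain JavaScript. It is an illustration under simplifying assumptions, not how any particular library works: the names createLimiter and check, and the shape of the returned object, are made up for this example, and counters live in memory and are never cleaned up.

```javascript
// Minimal in-memory fixed-window rate limiter (illustrative sketch only).
// createLimiter, check, and the result fields are hypothetical names.
function createLimiter({ windowMs, max }) {
  // Step 1: one counter per requester key (IP address, API key, account ID...).
  const counters = new Map(); // key -> { count, windowStart }

  return function check(key, now = Date.now()) {
    let entry = counters.get(key);

    // Step 2: start a fresh window for new requesters or expired windows.
    if (!entry || now - entry.windowStart >= windowMs) {
      entry = { count: 0, windowStart: now };
      counters.set(key, entry);
    }
    entry.count += 1;

    // Step 3: allow the request while the count is within the limit.
    if (entry.count <= max) {
      return { allowed: true, retryAfterSeconds: 0 };
    }

    // Step 4: over the limit, so report how long until the window resets.
    const msLeft = entry.windowStart + windowMs - now;
    return { allowed: false, retryAfterSeconds: Math.ceil(msLeft / 1000) };
  };
}

// Example: 3 requests per 60 seconds, one request per second from one IP.
const check = createLimiter({ windowMs: 60_000, max: 3 });
const t0 = 1_000_000; // fixed start time so the example is repeatable
const results = [0, 1, 2, 3].map((i) => check('203.0.113.7', t0 + i * 1000));

console.log(results.map((r) => r.allowed)); // [ true, true, true, false ]
console.log(results[3].retryAfterSeconds); // 57
```

In a real server, a blocked result would translate into a 429 status with the Retry-After header set to retryAfterSeconds. Passing the current time as an argument is only there to make the sketch easy to test.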
A Simple Rate Limiter in Express
Here's a basic rate limiter using the popular express-rate-limit package for Node.js. This example allows 100 requests per 15 minutes per IP address.
const express = require('express');
const rateLimit = require('express-rate-limit');

const app = express();

// Create the rate limiter
const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per window
  message: 'Too many requests. Please try again in 15 minutes.',
});

// Apply to all /api routes
app.use('/api', apiLimiter);

app.get('/api/hello', (req, res) => {
  res.json({ message: 'Hello, world!' });
});

app.listen(3000, () => {
  console.log('Server running on port 3000');
});
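That's it: a handful of lines and your API has a basic guardrail. By default, the limiter counts requests per IP address and rejects any address that goes over 100 requests in a 15-minute window with a 429 response. If your app runs behind a reverse proxy or load balancer, configure Express's trust proxy setting so the limiter sees the real client address rather than the proxy's.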
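IP addresses are only one way to identify a requester. As noted earlier, an API key or account ID can work too. The sketch below shows one way such a key could be chosen. The function name requesterKey and the x-api-key header are assumptions made for illustration, not part of the example above.

```javascript
// Pick the identifier a limiter should count against (illustrative sketch).
// The function name and the "x-api-key" header are assumptions.
function requesterKey(req) {
  // Node lowercases incoming header names, so look up the lowercase form.
  const apiKey = req.headers && req.headers['x-api-key'];
  if (typeof apiKey === 'string' && apiKey.trim() !== '') {
    return `key:${apiKey.trim()}`;
  }
  // Fall back to the client IP address when no API key is sent.
  return `ip:${req.ip}`;
}

// Plain objects stand in for Express request objects here.
console.log(requesterKey({ headers: { 'x-api-key': 'abc123' }, ip: '203.0.113.7' })); // key:abc123
console.log(requesterKey({ headers: {}, ip: '203.0.113.7' })); // ip:203.0.113.7
```

express-rate-limit accepts a keyGenerator option, so a function along these lines could be passed as keyGenerator: requesterKey. Check the documentation for the version you have installed, since the guidance on custom key functions may differ between releases.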