How to Add Rate Limiting to Your API

Learn how to protect your API from overload and abuse — the simple way.

What Is Rate Limiting?

Rate limiting is a set of rules that controls how many requests a user or app can make to your API in a given period of time. Think of it like a bouncer at a club — it lets a certain number of people in per minute, and anyone who tries to push past that limit gets turned away with a friendly error message instead of crashing the whole party.

When you build a public API, anyone who knows the URL can call it. That includes bad actors who deliberately hammer your server with thousands of requests per second, and buggy clients that do the same thing by accident. Rate limiting puts a speed bump on that behavior. It keeps your server healthy, your costs down, and your honest users happy.

Why You Need It from Day One

APIs are shared resources. If one user hammers your API with 10,000 requests in a single minute, the server has less capacity left to answer everyone else. In the best case, things get slow and frustrating. In the worst case, your site goes down for everyone.

Rate limiting also protects you financially. Many API hosting services charge based on how much computing power you use — and runaway requests can rack up a bill fast. A few simple limits can keep your costs predictable.

💡 Key Insight

Rate limiting isn't about blocking users. It's about making sure every user gets a fair share of your server. Without limits, one noisy user can crowd out everyone else.

Simple Steps: How Rate Limiting Works

Here's the basic idea in plain steps:

01

Track the requester

Every request to your API comes from somewhere — a user's IP address, their API key, or their account ID. Your rate limiter keeps a counter for each one.

02

Count requests in a time window

The limiter starts a timer (say, one minute). Each time that user makes a request, the counter goes up. When the timer runs out, the counter resets to zero and a new window begins. This approach is often called a fixed window.

03

Let them in or block them

If the counter is still below the limit, the request goes through normally and the counter goes up by one. If the limit has already been reached, the API returns an HTTP 429 "Too Many Requests" response and the user has to wait.

04

Tell them when they can try again

The 429 response usually includes a header called Retry-After that tells the client how long to wait before trying again, typically as a number of seconds (the header can also carry an HTTP date). Well-behaved API clients read this header and wait before retrying.
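The four steps above can be sketched in plain JavaScript with no dependencies. This is a minimal illustration, not how any particular library implements it: the class name FixedWindowLimiter, the check method, and the injectable now clock are all assumptions made for this example.

```javascript
// Minimal fixed-window rate limiter (illustrative sketch, not production code).
// Class name, method names, and default numbers are assumptions for this example.
class FixedWindowLimiter {
  constructor({ limit = 100, windowMs = 60000, now = Date.now } = {}) {
    this.limit = limit;         // max requests allowed per window
    this.windowMs = windowMs;   // window length in milliseconds
    this.now = now;             // injectable clock, handy for testing
    this.counters = new Map();  // key -> { count, windowStart }
  }

  // Step 1: `key` identifies the requester (IP address, API key, or account ID).
  check(key) {
    const now = this.now();
    let entry = this.counters.get(key);

    // Step 2: start a fresh window if there is none yet or the old one expired.
    if (!entry || now - entry.windowStart >= this.windowMs) {
      entry = { count: 0, windowStart: now };
      this.counters.set(key, entry);
    }

    // Step 3: let the request in while the counter is below the limit.
    if (entry.count < this.limit) {
      entry.count += 1;
      return { allowed: true, status: 200, retryAfterSeconds: 0 };
    }

    // Step 4: blocked. Report the time until the window resets, in whole seconds.
    const msLeft = entry.windowStart + this.windowMs - now;
    return { allowed: false, status: 429, retryAfterSeconds: Math.ceil(msLeft / 1000) };
  }
}
```

A server would call check(key) once per incoming request and, when allowed is false, respond with the 429 status and copy retryAfterSeconds into the Retry-After header. Note that this sketch keeps every key in memory forever and only works within a single process; a real deployment would likely need to evict old entries and share counters across servers.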

A Simple Rate Limiter in Express

Here's a basic rate limiter using the popular express-rate-limit package for Node.js. This example allows 100 requests per 15 minutes per IP address.

server.js
const express = require('express');
const rateLimit = require('express-rate-limit');

const app = express();

// Create the rate limiter
const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,  // 15 minutes
  max: 100,                   // 100 requests per window, per IP (newer versions call this option `limit`)
  message: 'Too many requests. Please try again in 15 minutes.',
});

// Apply to all /api routes
app.use('/api', apiLimiter);

app.get('/api/hello', (req, res) => {
  res.json({ message: 'Hello, world!' });
});

app.listen(3000, () => {
  console.log('Server running on port 3000');
});

That's it: a handful of lines and your API has a basic guardrail. By default, the limiter tracks each client's IP address and blocks any that exceed the 100-request threshold in a 15-minute window.
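It also helps to look at the limiter from the caller's side. When a client receives a 429, it can read the Retry-After header to decide how long to wait. Below is a small sketch of a helper for that. The function name retryDelayMs and the fallbackMs default are hypothetical, chosen for this example. It handles both forms the header can take: a whole number of seconds or an HTTP date.

```javascript
// Turn a Retry-After header value into a wait time in milliseconds.
// `retryDelayMs` and `fallbackMs` are hypothetical names for this sketch.
function retryDelayMs(headerValue, nowMs = Date.now(), fallbackMs = 1000) {
  // Header missing or empty: fall back to a default wait.
  if (headerValue == null || String(headerValue).trim() === '') {
    return fallbackMs;
  }
  const value = String(headerValue).trim();

  // Form 1: a non-negative whole number of seconds, e.g. "120".
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000;
  }

  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(value);
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - nowMs); // never return a negative wait
  }

  // Unrecognized value: fall back to a default wait.
  return fallbackMs;
}
```

A client could pass the header from a 429 response (for example, response.headers.get('retry-after') when using fetch) into this helper, sleep for the returned number of milliseconds, and then retry the request. Capping the number of retries is a sensible addition so a client never loops forever.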

Knowledge Check

Test what you learned with this quick quiz.

Quick Quiz — 3 Questions

Question 1
What HTTP status code does a rate-limited API return when a user exceeds the limit?
Question 2
What does the Retry-After header tell the client?
Question 3
Why does rate limiting help control API hosting costs?