Rate Limiting and Throttling Explained
How websites control how many requests you can make — and why it keeps the whole internet running.
Too Many Requests, Not Enough Seats
Imagine a coffee shop with 10 chairs. If 50 people try to sit down at once, there's chaos. The shop needs a rule: one person per seat, and if the seats are full, new people wait. That's rate limiting — a set of rules that controls how many times something can happen in a given period of time.
When you use an app or a website, your browser sends requests to a server. Servers can only handle so many requests at once — just like those 10 chairs. Rate limiting is how servers make sure everyone gets a fair turn without crashing the whole system.
Throttling is a softer version of the same idea. Instead of cutting someone off completely when they hit their limit, throttling slows them down. Think of it like a speed governor on a car: you can go fast, but not dangerously fast.
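The "slow down instead of rejecting" idea can be sketched in a few lines of JavaScript. This is an illustrative helper (the names `createThrottle` and `waitTime` are made up for this example, not from any library): instead of saying no, it tells the caller how long to wait so calls are spaced at most one per interval.

```javascript
// A minimal throttle sketch: rather than rejecting calls over the limit,
// it spaces them out to at most one call every `intervalMs` milliseconds.
function createThrottle(intervalMs) {
  let nextAllowedAt = 0; // timestamp when the next call may proceed

  // Returns how many milliseconds the caller should wait before proceeding.
  return function waitTime(now = Date.now()) {
    const delay = Math.max(0, nextAllowedAt - now);
    nextAllowedAt = Math.max(now, nextAllowedAt) + intervalMs;
    return delay;
  };
}

const throttle = createThrottle(1000); // at most 1 call per second
console.log(throttle(0)); // 0    — first call goes through immediately
console.log(throttle(0)); // 1000 — second call must wait a full second
console.log(throttle(0)); // 2000 — third call queues behind the second
```

Notice that every call eventually gets through; it just gets pushed further into the future. That queuing behavior is the key difference from a hard rate limit, which would reject the extra calls outright.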
Keeping the Internet Fair and Online
Without rate limiting, one person with a fast script could crowd out everyone else. A single automated program can send thousands of requests per second — that's like one person trying to grab all 10 coffee shop seats at once.
Rate limiting protects three things: the service stays online so everyone can use it, legitimate users get fair access without slowdowns, and costs stay predictable because the server isn't overwhelmed.
💡 Key Insight
Rate limiting isn't about blocking users — it's about making sure the internet works for everyone. Even the biggest companies like Google and AWS use rate limits to keep their services stable during traffic spikes.
The Three Main Ways to Limit
Servers track how many requests come in using a few different methods. Here are the most common ones:
- Fixed Window — The server picks a time window (like one minute) and counts requests inside it. When the window resets, the count starts over. Simple to understand, but can have a "boundary burst" problem where requests spike at the exact moment a window resets.
- Token Bucket — Think of a bucket that holds tokens. Each request uses one token. Tokens refill at a steady rate (say, 10 per minute). You can burst up to the bucket's max, but then you wait for refills. This is the most common approach for API rate limiting.
- Leaky Bucket — Requests flow through like water through a leaky bucket — at a constant rate no matter how fast they arrive. Extra requests queue up or get dropped. Great for smoothing out traffic spikes.
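The fixed-window approach from the list above is the simplest to write down. Here's a small sketch (function names like `createFixedWindow` are illustrative, not from any library): each timestamp maps to a window number, the counter resets whenever a new window starts, and requests over the limit are rejected until the next window.

```javascript
// A fixed-window counter sketch: at most `limit` requests per `windowMs`.
function createFixedWindow(limit, windowMs) {
  let currentWindow = -1; // which window we're counting in
  let count = 0;          // requests seen in that window

  // Returns true if the request is allowed, false if rate limited.
  return function allow(now = Date.now()) {
    const window = Math.floor(now / windowMs); // window index for this moment
    if (window !== currentWindow) {
      currentWindow = window; // a new window has started...
      count = 0;              // ...so the count starts over
    }
    if (count >= limit) return false;
    count++;
    return true;
  };
}

const allow = createFixedWindow(3, 60000); // 3 requests per minute
console.log(allow(0), allow(1), allow(2)); // true true true
console.log(allow(3));                     // false — limit hit in this window
console.log(allow(60000));                 // true  — new window, count resets
```

The last two lines also show the "boundary burst" problem: 3 requests at the very end of one window plus 3 at the start of the next means 6 requests in a few milliseconds, even though the stated limit is 3 per minute. The token bucket avoids this by refilling gradually instead of all at once.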
When a user hits their limit, the server usually sends back a special HTTP status code: 429 Too Many Requests. This tells the calling app or browser: "Slow down, come back later."
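A well-behaved client reacts to a 429 rather than hammering the server again. Servers often include a Retry-After header saying how many seconds to wait; if it's absent, a common fallback is exponential backoff (wait 1s, then 2s, then 4s...). Here's a sketch of that decision — the helper name `retryDelayMs` is made up for this example:

```javascript
// Decide how long (in ms) to wait before retrying, given the response status,
// the Retry-After header value (a string or null), and the attempt number.
function retryDelayMs(status, retryAfterHeader, attempt) {
  if (status !== 429) return 0; // not rate limited, no need to wait

  const retryAfter = Number(retryAfterHeader);
  if (retryAfterHeader !== null && !Number.isNaN(retryAfter)) {
    return retryAfter * 1000; // honor the server's hint (seconds → ms)
  }

  // No hint from the server: exponential backoff, capped at 30 seconds.
  return Math.min(30000, 1000 * 2 ** attempt);
}

console.log(retryDelayMs(200, null, 0)); // 0     — request succeeded
console.log(retryDelayMs(429, "5", 0));  // 5000  — server said wait 5 seconds
console.log(retryDelayMs(429, null, 2)); // 4000  — fallback: 1s * 2^2
```

Honoring the server's hint matters: retrying immediately after a 429 just burns more of your quota and keeps you rate limited longer.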
A Simple Rate Limiter in Code
Here's a tiny JavaScript example that shows the basic idea of token bucket rate limiting. Every time you call `makeRequest()`, it checks whether a token is available.
```javascript
// Token bucket rate limiter
// Allows 5 requests, refills 1 token every second
const bucket = {
  tokens: 5,
  maxTokens: 5,
  refillRate: 1, // per second
  lastRefill: Date.now()
};

function refillBucket() {
  const now = Date.now();
  const secondsPassed = (now - bucket.lastRefill) / 1000;
  bucket.tokens = Math.min(
    bucket.maxTokens,
    bucket.tokens + (secondsPassed * bucket.refillRate)
  );
  bucket.lastRefill = now;
}

function makeRequest() {
  refillBucket();
  if (bucket.tokens < 1) {
    console.log("⛔ Rate limited! Wait a moment.");
    return false;
  }
  bucket.tokens--;
  console.log("✅ Request sent! Tokens left:", bucket.tokens);
  return true;
}

// Try making 7 requests
for (let i = 0; i < 7; i++) {
  makeRequest();
}
```
The output would look like this:
```
✅ Request sent! Tokens left: 4
✅ Request sent! Tokens left: 3
✅ Request sent! Tokens left: 2
✅ Request sent! Tokens left: 1
✅ Request sent! Tokens left: 0
⛔ Rate limited! Wait a moment.
⛔ Rate limited! Wait a moment.
```
Knowledge Check
Test what you learned with this quick quiz.