Rate Limiting and Throttling Explained

How websites stop one user from taking all the power.

Two Guards at the Door

Imagine a coffee shop. If one person camps at the counter ordering 50 drinks, everyone else waits. To stop that, the barista sets a rule: one person gets five drinks max. That's like rate limiting — a hard cap on how many requests a user can make.

Now imagine the same shop on a busy morning. Instead of saying "no more," the barista slows down, taking a little longer over each drink so everyone still gets served fairly. That's like throttling: slowing things down instead of stopping them completely.

Rate limiting and throttling are two ways websites and apps keep things fair and stable. They make sure no single user can hog all the computer power, crash the site, or abuse a free service.

Why Fairness and Stability Matter

When a website has thousands of people using it at once, every request uses some of the computer's brainpower. If one person sends a million requests per second, the server gets overloaded and crashes for everyone.

Rate limiting and throttling protect three things:

  • Stability — The site stays up and running for all users
  • Fairness — One person can't jump ahead of thousands of others
  • Cost control — Running servers costs money, so companies need to limit how much work gets done

💡 Key Insight

Many free API plans cap usage at a fixed rate, for example 100 requests per minute. Hit that cap, and further requests are rejected until the current window resets. It's not a punishment; it's the service's way of making sure you're not using more than your share.

The Two Strategies in Action

Here's how each approach works in practice:

🚫 Rate Limiting

  • Hard cutoff — requests are blocked entirely when the limit is reached
  • Common for API calls — "100 requests per minute max"
  • Usually returns an error code like 429 (Too Many Requests)
  • Like a movie theater enforcing a strict capacity limit

⏳ Throttling

  • Soft slowdown — requests go through but take longer
  • Common for uploads and data streams
  • Keeps service running, just at a reduced pace
  • Like a highway reducing speed from 100 to 60 during a storm
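
To make the "soft slowdown" concrete, here is a minimal sketch of one way to throttle: instead of rejecting a burst of requests, the server spaces them out. The function name scheduleReleases and the 100 ms gap are assumptions made up for this illustration, not part of any particular library.

```javascript
// Hypothetical helper: given the times (in ms) at which requests arrive,
// work out when each one is actually let through if requests must be
// at least minGapMs apart. Nothing is rejected; late requests just wait.
function scheduleReleases(arrivalTimesMs, minGapMs) {
  const releases = [];
  let nextFree = -Infinity; // earliest time the next request may go through

  for (const arrival of arrivalTimesMs) {
    // A request goes through when it arrives, or when the gap allows,
    // whichever is later.
    const release = Math.max(arrival, nextFree);
    releases.push(release);
    nextFree = release + minGapMs;
  }
  return releases;
}

// Four requests arrive in a quick burst; only one is allowed per 100 ms.
console.log(scheduleReleases([0, 10, 20, 30], 100)); // [ 0, 100, 200, 300 ]
```

Every request in the burst still gets served, but the last one waits 270 ms longer than it would have on an unthrottled server.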

Both approaches track requests over time. A server watches each user or IP address, counts how many requests come in, and makes a decision: block it, slow it down, or let it pass.
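
The count-then-decide loop described above might look roughly like this. It is a simplified sketch: the thresholds SOFT_LIMIT and HARD_LIMIT, the handleRequest function, and the example IP address are all hypothetical, and the time window is left out to keep the focus on the three possible decisions.

```javascript
// Hypothetical thresholds, kept small so the example is easy to follow.
const SOFT_LIMIT = 3; // after this many requests, start slowing the client down
const HARD_LIMIT = 5; // after this many requests, block the client

// Requests seen so far, keyed by user ID or IP address.
const counts = new Map();

// Count the request, then decide: "pass", "slow", or "block".
function handleRequest(clientId) {
  const seen = (counts.get(clientId) || 0) + 1;
  counts.set(clientId, seen);

  if (seen > HARD_LIMIT) return "block"; // rate limiting: hard cutoff
  if (seen > SOFT_LIMIT) return "slow";  // throttling: soft slowdown
  return "pass";
}

// One client sends six requests in a row.
const decisions = [];
for (let i = 0; i < 6; i++) {
  decisions.push(handleRequest("203.0.113.7"));
}
console.log(decisions.join(", ")); // pass, pass, pass, slow, slow, block
```

Each client gets its own counter, so a second client would still start at "pass" even after the first one has been blocked.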

A Simple Rate Limit in Code

Here's what rate limiting looks like as code. This tiny script tracks how many times a user has called an API and blocks them if they go over the limit:

rate-limiter.js
// Track requests for each user
const requests = {};

function checkRateLimit(userId) {
  const LIMIT = 100;       // max 100 requests
  const WINDOW = 60 * 1000; // per 60 seconds

  // Reset counter if window has passed
  const now = Date.now();
  if (!requests[userId] || now - requests[userId].start >= WINDOW) {
    requests[userId] = { count: 0, start: now };
  }

  // Block if over limit, and report how many seconds are left in the window
  if (requests[userId].count >= LIMIT) {
    const retryAfter = Math.ceil((requests[userId].start + WINDOW - now) / 1000);
    return { allowed: false, retryAfter };
  }

  requests[userId].count++;
  return { allowed: true, remaining: LIMIT - requests[userId].count };
}

// Test it
const result = checkRateLimit("user_42");
console.log(result); // { allowed: true, remaining: 99 }

Every time a user makes a request, the code checks the counter. If they have made fewer than 100 requests in the current 60-second window, the request goes through. Once they reach 100, the function returns allowed: false, and the server typically turns that into a "429 Too Many Requests" error.
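
How a rate-limit result becomes an actual HTTP response depends on the server, but the idea can be sketched in a few lines. The helper toHttpResponse below is hypothetical; it assumes a result object shaped like { allowed, retryAfter } and uses the standard Retry-After header to tell the client how many seconds to wait.

```javascript
// Hypothetical helper: turn a rate-limit result into the pieces of an
// HTTP response (status code, headers, body).
function toHttpResponse(result) {
  if (result.allowed) {
    return { status: 200, headers: {}, body: "OK" };
  }
  return {
    status: 429, // Too Many Requests
    headers: { "Retry-After": String(result.retryAfter) }, // seconds to wait
    body: "Too Many Requests",
  };
}

const blocked = toHttpResponse({ allowed: false, retryAfter: 30 });
console.log(blocked.status);                 // 429
console.log(blocked.headers["Retry-After"]); // 30
```

A well-behaved client reads the Retry-After value and waits that long before trying again, rather than retrying immediately.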

Knowledge Check

Test what you learned with this quick quiz.

Quick Quiz — 3 Questions

Question 1
What does rate limiting do when a user hits their request limit?
Question 2
Which of these is the best reason websites use rate limiting?
Question 3
What's the main difference between rate limiting and throttling?