AI & Agents

How Agents Handle Unexpected Errors

What happens when an AI agent runs into a problem it didn't see coming — and how good ones recover.

01 — The Concept

Errors Are Inevitable

An AI agent is a helper that takes actions on its own — it opens pages, reads files, calls tools, and answers questions. But the real world is messy. Websites go down. Files are missing. Answers come back empty. A typo in a command can break a whole step. These surprises are called unexpected errors.

An unexpected error is anything the agent was not prepared for. It can be a tiny hiccup (the page loaded a second late) or a big wall (the API you needed is offline). The point is: the agent didn't plan for it.

A well-built agent doesn't just crash or give up. It notices the problem, thinks about what to do, and picks the best next move — like a person who runs into a closed door and quickly looks for another way in.

02 — Why It Matters

Errors Decide If a Tool Is Trustworthy

Agents that crash on the first problem feel like toys. Agents that recover feel like real tools you can rely on. The difference is almost always in how they handle errors.

Think about a customer service bot. A bad one says "Something went wrong, try again later" and gives up. A good one says "I couldn't reach the order system, but let me try a different way" — and gets you an answer. The first one wastes your time. The second one earns your trust.
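The "try a different way" behavior can be sketched as a fallback chain. This is only an illustration: `answer_with_fallback`, `order_system`, and `backup_export` are hypothetical names invented here, and a real bot would catch narrower exception types than `Exception`.

```python
def answer_with_fallback(order_id, sources):
    """Try each (name, lookup) pair in order and return the first result.

    `sources` is a hypothetical list of backends, for example the main
    order system followed by a slower backup.
    """
    problems = []
    for name, lookup in sources:
        try:
            return {"ok": True, "source": name, "order": lookup(order_id)}
        except Exception as exc:  # broad only because the backends are made up
            problems.append(f"{name}: {exc}")
    # Every route failed: say what was tried instead of a vague apology.
    return {"ok": False, "problems": problems}


def order_system(order_id):
    # Hypothetical primary backend that happens to be down.
    raise ConnectionError("order system unreachable")


def backup_export(order_id):
    # Hypothetical secondary backend that still works.
    return {"id": order_id, "status": "shipped"}


result = answer_with_fallback(
    "A123", [("order_system", order_system), ("backup_export", backup_export)]
)
print(result)
```

If every source fails, the function still returns a structured result listing what was tried, so the bot can explain the problem rather than just saying "Something went wrong."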

For builders, error handling is also a safety issue. An agent that doesn't know how to fail safely can take the wrong action, repeat a mistake forever, or burn through your money retrying a broken task. Smart error handling is what keeps an agent from running off the rails.
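One way to guard against endless or expensive retries is a hard budget that the agent must check before every attempt. The sketch below is an assumption about how such a guard could look. `RetryBudget`, `BudgetExceeded`, and the limits shown are invented for illustration.

```python
class BudgetExceeded(Exception):
    """Raised when the agent has used up its attempt or cost allowance."""


class RetryBudget:
    """Hypothetical guard that caps both attempts and total spend."""

    def __init__(self, max_attempts=5, max_cost=1.0):
        self.max_attempts = max_attempts
        self.max_cost = max_cost
        self.attempts = 0
        self.cost = 0.0

    def spend(self, cost):
        """Record one attempt, or raise if it would exceed a limit."""
        if self.attempts + 1 > self.max_attempts:
            raise BudgetExceeded(f"attempt limit of {self.max_attempts} reached")
        if self.cost + cost > self.max_cost:
            raise BudgetExceeded(f"cost limit of {self.max_cost} reached")
        self.attempts += 1
        self.cost += cost


budget = RetryBudget(max_attempts=2, max_cost=1.0)
budget.spend(0.25)  # first attempt is allowed
budget.spend(0.25)  # second attempt is allowed
try:
    budget.spend(0.25)  # third attempt exceeds max_attempts
except BudgetExceeded as exc:
    print(f"Stopping safely: {exc}")
```

Calling `spend` before each action means a broken task fails loudly once the limit is reached instead of looping in the background.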

💡 Key Insight

Many "AI agent failures" you hear about are not really AI failures. They come from missing error handling. The model did its best; the system around it just had no plan for what to do when things went wrong.

03 — How It Works

The Error Recovery Loop

When an agent hits an unexpected error, a good one follows a simple recovery loop. It tries, notices, decides, and remembers, often in just a few seconds.

The Recovery Loop

⚙️ Try (run the action) → 👀 Notice (catch the error) → 💬 Decide (retry, change, or ask) → 🎯 Remember (save what you learned) → ↺ try again

Each step matters:

Try — take the action you planned.
Notice — check the result. Did it work? Did it fail? What kind of failure?
Decide — pick a recovery move. Common options: retry the same thing, change approach and try something else, ask the human for help, or stop and report.
Remember — write down what happened so you don't make the same mistake twice.

Skipping the Remember step is a common mistake. An agent that doesn't record its errors will hit the same wall over and over.
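The four steps can be sketched in a few small pieces. Everything here is an assumption made for illustration: the names `decide`, `FailureMemory`, and `run_with_recovery` are hypothetical, the mapping from exception types to recovery moves is just one possible policy, and the memory lives only in a dictionary rather than in durable storage.

```python
def decide(error, attempt, max_attempts):
    """Decide: map a failure to one of the four recovery moves."""
    if isinstance(error, TimeoutError) and attempt < max_attempts:
        return "retry"
    if isinstance(error, FileNotFoundError):
        return "change_approach"
    if isinstance(error, PermissionError):
        return "ask_human"
    return "stop_and_report"


class FailureMemory:
    """Remember: a tiny in-memory record of approaches that failed."""

    def __init__(self):
        self._failed = {}

    def remember(self, name, error):
        self._failed[name] = repr(error)

    def already_failed(self, name):
        return name in self._failed


def run_with_recovery(approaches, memory, max_attempts=3):
    """Run (name, action) alternatives in order until one succeeds."""
    for name, action in approaches:
        if memory.already_failed(name):
            continue  # skip an approach that is known not to work
        for attempt in range(1, max_attempts + 1):
            try:
                return action()  # Try
            except Exception as error:  # Notice
                move = decide(error, attempt, max_attempts)
                if move == "retry":
                    continue
                memory.remember(name, error)
                if move == "change_approach":
                    break  # fall through to the next alternative
                # "ask_human" and "stop_and_report" both stop the loop here
                raise RuntimeError(f"{name}: {move}") from error
    raise RuntimeError("no approach left to try")
```

Because the memory is passed in, a second call with the same `FailureMemory` skips any approach that already failed, which is the whole point of the Remember step.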

04 — Practical Example

A Simple Retry Loop in Python

Here's a tiny example showing the recovery loop in code. The function tries to fetch a webpage. If the request fails or the page returns a status other than 200, it waits two seconds and tries again, up to three times. If it still fails, it stops and raises a clear error that names the URL and the number of tries, so the caller can report the problem.

agent_retry.py

import requests
import time

def fetch_page(url, tries=3):
    # Step 1: Try the action
    for attempt in range(1, tries + 1):
        try:
            response = requests.get(url, timeout=5)

            # Step 2: Notice — did it work?
            if response.status_code == 200:
                return response.text

            print(f"Attempt {attempt}: got status {response.status_code}")

        except requests.RequestException as e:
            # Step 2: Notice — something went wrong
            print(f"Attempt {attempt} failed: {e}")

        # Step 3: Decide — wait, then retry
        if attempt < tries:
            print("Waiting 2 seconds before retry...")
            time.sleep(2)

    # Step 4: Give up gracefully and report
    raise RuntimeError(f"Could not fetch {url} after {tries} tries")

This is the heart of the recovery loop in just a few lines: try, notice the error, decide to wait and try again, and stop cleanly if it still doesn't work. Real agents do the same thing — just with more steps, more tools, and a way to remember the mistake for next time.
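A common refinement of the example above is to retry only failures that are likely to be temporary, and to wait longer after each failed try. The helpers below are a sketch under assumptions, not part of the original example: `RETRYABLE_STATUS`, `is_retryable`, and `backoff_delay` are hypothetical names, and the set of status codes is one reasonable choice rather than a rule.

```python
# Status codes that usually signal a temporary problem (assumed list).
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def is_retryable(status_code):
    """Return True when trying again later might help."""
    return status_code in RETRYABLE_STATUS


def backoff_delay(attempt, base=2.0, cap=30.0):
    """Seconds to wait after a failed attempt: base, 2*base, 4*base, up to cap."""
    return min(cap, base * 2 ** (attempt - 1))


for attempt in range(1, 4):
    print(f"After attempt {attempt}, wait {backoff_delay(attempt)} seconds")
```

With this policy a 404 would stop immediately, since the page is unlikely to appear on the next try, while a 503 would be retried after a growing pause.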

05 — Test Yourself

Knowledge Check

Test what you learned with this quick quiz.

Quick Quiz — 3 Questions

Question 1

What is an "unexpected error" for an AI agent?

Question 2

What are the four steps in the error recovery loop?

Question 3

Why is the "Remember" step so important?
