What Is Regex — and How Do Patterns Find Text?

Learn how to write patterns that search, match, and extract exactly what you need from any block of text.

Patterns That Hunt Through Text

Imagine you have a huge document and you want to find every phone number in it. You could read through the whole thing manually — or you could use a pattern. A phone number in the US looks like this: three digits, a dash, three digits, a dash, four digits. Regex lets you describe that pattern once, and a computer applies it everywhere.

Regex, short for regular expression, is a way to describe what you're looking for in text. Instead of searching for one exact word, you write a pattern in a small, specialized language that describes the shape of the text itself. Letters, digits, symbols, spaces: regex has a symbol for almost every kind of character.

The simplest regex is just plain text. If you search for cat, it matches those three letters in that order wherever they appear, whether as the word "cat" or inside a longer word like "concatenate." But regex gets powerful when you use its special symbols. Want to find any digit? Use \d. Any whitespace character, such as a space or tab? Use \s. The start of the text? ^. The end? $.
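
As a minimal sketch of these basics in JavaScript, the snippet below tries a plain-text pattern and the four symbols just mentioned. The sample strings and variable names are invented for illustration.

```javascript
// Sample text invented for illustration.
const sentence = "The cat sat; concatenate too.";

// A plain-text pattern matches those characters wherever they appear,
// including inside a longer word such as "concatenate".
const catMatches = sentence.match(/cat/g);
console.log(catMatches); // ["cat", "cat"]

// \d matches a single digit, \s a single whitespace character.
const hasDigit = /\d/.test("Room 7");
const hasSpace = /\s/.test("Room 7");
console.log(hasDigit, hasSpace); // true true

// ^ anchors the pattern to the start of the text, $ to the end.
const startsWithRoom = /^Room/.test("Room 7");
const endsWithDigit = /\d$/.test("Room 7");
console.log(startsWithRoom, endsWithDigit); // true true
```

Here .test() only answers yes or no, while .match() with the g flag returns the matched pieces themselves.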

The Skill Behind Every Smart Search

Every time an AI pulls a date out of an email, or a tool finds all the URLs in a document, or a form checks that a phone number is actually a phone number, there's a good chance regex is working behind the scenes. It's one of the least visible but most powerful skills in text processing.

If you work with data, emails, logs, or user input, regex is one of the fastest skills you can pick up. A task that would take an hour of copy-pasting can be done in seconds. Programmers and AI tools both use it constantly to extract and validate information.

💡 Key Insight

Regex isn't about memorizing every symbol — it's about understanding the idea: describing the shape of what you want, then letting the computer do the searching. Once that clicks, the rest is practice.

Reading Regex Like a Mini Language

Regex uses a handful of special characters to represent different kinds of text. Here are the most common ones, explained simply:

🔢

Character Classes

\d = any digit (0–9)
\w = any letter, digit, or underscore
\s = any whitespace character (space, tab, or line break)
. = any single character except a line break
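
To see the character classes in action, here is a small JavaScript sketch run against a made-up sample string. The variable names are illustrative only.

```javascript
// Sample text invented for illustration.
const sample = "Order #42 shipped to A_7 on day 3.";

// \d matches one digit at a time.
const digits = sample.match(/\d/g);
console.log(digits); // ["4", "2", "7", "3"]

// \w+ matches runs of letters, digits, or underscores.
const words = sample.match(/\w+/g);
console.log(words);
// ["Order", "42", "shipped", "to", "A_7", "on", "day", "3"]

// \s matches each whitespace character.
const spaceCount = sample.match(/\s/g).length;
console.log(spaceCount); // 7

// . matches any character except a line break, so the "\n" is skipped.
const dotMatches = "a\nb".match(/./g);
console.log(dotMatches); // ["a", "b"]
```

Note that "#" and the final period are skipped by \w, and that the underscore in "A_7" counts as a word character.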

🔁

Repeats

* = zero or more times
+ = one or more times
? = optional (zero or one)
{3} = exactly 3 times
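
The repeat symbols are easiest to compare side by side. The following JavaScript sketch uses invented sample strings, and each repeat symbol applies only to the item directly before it.

```javascript
// * = zero or more: "g" followed by any number of "o"s, including none.
const zeroOrMore = "g go goo gooo".match(/go*/g);
console.log(zeroOrMore); // ["g", "go", "goo", "gooo"]

// + = one or more: at least one "o" is required, so the lone "g" drops out.
const oneOrMore = "g go goo gooo".match(/go+/g);
console.log(oneOrMore); // ["go", "goo", "gooo"]

// ? = optional: the "u" may appear once or not at all.
const optionalU = "color colour colouur".match(/colou?r/g);
console.log(optionalU); // ["color", "colour"]

// {3} = exactly three digits in a row.
const threeDigits = "7 42 365".match(/\d{3}/g);
console.log(threeDigits); // ["365"]
```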

📍

Position

^ = start of the text (or of each line in multiline mode)
$ = end of the text (or of each line in multiline mode)
\b = word boundary
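
Position symbols match a place in the text rather than a character. The sketch below shows this in JavaScript with invented sample strings. One detail is specific to JavaScript: ^ and $ refer to the whole text unless the m (multiline) flag is added.

```javascript
// Sample text invented for illustration.
const phrase = "cat scatter cat";

// ^ and $ check the very start and very end of the text.
const startsWithCat = /^cat/.test(phrase);
const endsWithCat = /cat$/.test(phrase);
console.log(startsWithCat, endsWithCat); // true true

// Without \b, the "cat" hidden inside "scatter" is matched too.
const anyCat = phrase.match(/cat/g);
console.log(anyCat.length); // 3

// \b requires a word boundary on each side, so only whole words match.
const wholeWordCat = phrase.match(/\bcat\b/g);
console.log(wholeWordCat.length); // 2

// With the m flag, ^ also matches at the start of every line.
const twoLines = "one fish\ntwo fish";
const withoutM = twoLines.match(/^\w+/g);
const withM = twoLines.match(/^\w+/gm);
console.log(withoutM); // ["one"]
console.log(withM); // ["one", "two"]
```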

You can combine these. \d{3}-\d{3}-\d{4} describes a US phone number written in that format: three digits, a dash, three digits, a dash, four digits. ^\w+ finds the first word of the text, or the first word on every line when multiline mode (the m flag in JavaScript) is turned on. \. matches an actual period (the backslash "escapes" the dot so it's not read as "any character").
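
The combined patterns above can be tried out with a short JavaScript sketch. The contact data is made up, and the last two checks go one step further as an illustration: wrapping the phone pattern in ^ and $ turns a search into a check that the whole input has that shape.

```javascript
// Sample data invented for illustration.
const contacts = "Alice: 555-123-4567\nBob: 555-987-6543";

// Three digits, a dash, three digits, a dash, four digits.
const phones = contacts.match(/\d{3}-\d{3}-\d{4}/g);
console.log(phones); // ["555-123-4567", "555-987-6543"]

// First word on every line (the m flag makes ^ match at each line start).
const firstWords = contacts.match(/^\w+/gm);
console.log(firstWords); // ["Alice", "Bob"]

// An unescaped dot matches any character; an escaped dot only a period.
const unescapedDot = /1.2/.test("1x2");
const escapedDot = /1\.2/.test("1x2");
const escapedDotOnPeriod = /1\.2/.test("1.2");
console.log(unescapedDot, escapedDot, escapedDotOnPeriod); // true false true

// Anchoring both ends checks that the entire input is a phone number.
const wholeInputIsPhone = /^\d{3}-\d{3}-\d{4}$/.test("555-123-4567");
const extraTextRejected = /^\d{3}-\d{3}-\d{4}$/.test("call 555-123-4567 now");
console.log(wholeInputIsPhone, extraTextRejected); // true false
```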

Finding Emails and Phone Numbers

Let's say you have a document full of mixed-up contact info. You want to pull out every email and phone number. In JavaScript, you can do this with regex in just a few lines:

extract-contacts.js
// Your messy data
const text = "Call 555-123-4567 or email jane@example.com ASAP.";

// \b means word boundary — keeps email from bleeding into nearby text
const emailRegex = /\b[\w.-]+@[\w.-]+\.\w+\b/g;
const phoneRegex = /\d{3}-\d{3}-\d{4}/g;

// .match() applies the pattern and returns what it finds
console.log(text.match(emailRegex));
// Output: ["jane@example.com"]

console.log(text.match(phoneRegex));
// Output: ["555-123-4567"]

Breaking down the email pattern: [\w.-]+ means "one or more word characters, dots, or dashes." The @ is literal. Then [\w.-]+ again for the domain, \. for the dot, and \w+ for the extension, like "com". The g at the end means "find all matches, not just the first one." This pattern is deliberately simple: it catches common addresses but is not a complete validator for every legal email address.
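
To make the effect of the g flag concrete, here is a short sketch that runs the same email pattern with and without it. The sample message and variable names are invented for illustration.

```javascript
// Sample text invented for illustration.
const message = "Write to jane@example.com or sam.lee@mail.example.org today.";

// Without g, .match() stops at the first match and reports where it was found.
const firstOnly = message.match(/\b[\w.-]+@[\w.-]+\.\w+\b/);
console.log(firstOnly[0]); // "jane@example.com"
console.log(firstOnly.index); // 9

// With g, .match() returns every match as a plain array of strings.
const allEmails = message.match(/\b[\w.-]+@[\w.-]+\.\w+\b/g);
console.log(allEmails); // ["jane@example.com", "sam.lee@mail.example.org"]

// When nothing matches, .match() returns null rather than an empty array.
const noEmails = "No contact info here.".match(/\b[\w.-]+@[\w.-]+\.\w+\b/g);
console.log(noEmails); // null
```

Because of that null result, it is worth checking the return value before looping over it or reading its length.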

Knowledge Check

Test what you learned with this quick quiz.

Quick Quiz — 3 Questions

Question 1
What does the regex pattern \d{3}-\d{4} match?
Question 2
What does the . (dot) mean in a regex pattern?
Question 3
What does the + symbol do in a regex pattern?