How to Run AI Models Locally With Ollama

Skip the cloud fees and keep your data on your own machine with free, open-source AI.

What Is Ollama and Why Does It Exist?

Every time you use ChatGPT, Claude, or Gemini, your messages travel to someone else's computer — a big data center somewhere — and come back. That's called "the cloud." It works fine, but it costs money and means your conversations are stored on servers you don't control.

Ollama changes that. It's a free, open-source tool that lets you download and run AI models directly on your own computer. No internet needed after the first download. No monthly fees. No data leaving your machine. You download a model once, and it runs forever on your own hardware.

Think of it like downloading a video instead of streaming it. Once it's on your computer, you own it and can watch it whenever you want — without an internet connection or a subscription.

Privacy, Speed, and Zero Cost

There are three big reasons developers and everyday users are switching to local AI tools like Ollama.

Privacy: When you run a model locally, your data never leaves your computer. For developers building products that handle sensitive information — medical notes, legal documents, business strategies — this is a huge deal. You don't have to worry about where your data goes or who can see it.

Speed: For small tasks, local models can feel just as fast as sending requests to a remote server, and sometimes faster, especially if you have a good graphics card (GPU) in your machine.

Cost: Cloud AI services typically charge per token (roughly, per chunk of text processed) or through a monthly subscription. Ollama itself is free. The only ongoing cost is the electricity your computer uses while the model is running.

💡 Key Insight

You don't need a supercomputer to run local AI. A modern machine with 16GB of RAM, such as a recent Apple MacBook or a gaming PC, can run smaller AI models just fine. The key is picking the right model size for your hardware.
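
As a rough way to match model size to hardware, the sketch below reads your installed RAM and prints a size tier. The thresholds follow the rule of thumb in the Ollama README (about 8 GB of RAM for 7B models, 16 GB for 13B, 32 GB for 33B). The function names `suggest_model_size` and `detect_ram_gb` are made up for this example, the under-8 GB tier is an assumption, and detection only covers macOS and Linux/WSL.

```shell
#!/bin/sh
# Hypothetical helper: map RAM (whole GB) to a model-size tier.
suggest_model_size() {
  ram_gb=$1
  if [ "$ram_gb" -ge 32 ]; then
    echo "up to ~33B parameters"
  elif [ "$ram_gb" -ge 16 ]; then
    echo "up to ~13B parameters"
  elif [ "$ram_gb" -ge 8 ]; then
    echo "up to ~7B parameters"
  else
    # Assumption for this sketch; the README gives no tier below 8 GB.
    echo "small models only (about 3B or fewer)"
  fi
}

# Hypothetical helper: installed RAM in GB, rounded up; prints 0 if unknown.
detect_ram_gb() {
  case "$(uname -s)" in
    Darwin)
      # hw.memsize reports physical memory in bytes.
      echo $(( $(sysctl -n hw.memsize) / 1073741824 )) ;;
    Linux)
      # MemTotal is in kB and slightly below the physical size, so round up.
      awk '/^MemTotal:/ { gb = $2 / 1048576; printf "%d\n", (gb > int(gb)) ? int(gb) + 1 : gb }' /proc/meminfo ;;
    *)
      echo 0 ;;
  esac
}

suggest_model_size "$(detect_ram_gb)"
```

Treat the result as a starting point only: a GPU, other running apps, and the model's quantization all change what fits comfortably.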

Getting Up and Running in 3 Steps

Getting Ollama running on your computer is surprisingly simple. Here's how it works:

The Ollama Setup Loop

Step 1: ⬇️ Download. Install Ollama on your machine.
Step 2: 🤖 Pull Model. Download an AI model from the library.
Step 3: 💬 Run & Chat. Start the model and talk to it.

After you install Ollama, you can open your terminal (a text-based command screen) and type simple commands. Want to run a chatbot? One command starts it. Want a different model? Swap it in with another one-line command.

Ollama comes with a built-in model library: a list of free models you can download. Popular choices include Llama 3 (from Meta), Mistral (from Mistral AI), and Phi (from Microsoft), each with different strengths and sizes. Smaller models (around 3-7 billion parameters) run on modest hardware. Larger models (70+ billion parameters) need a powerful GPU and far more memory, but generally give better answers.
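
To make the size choice concrete: models in the Ollama library are addressed as `name:tag`, and the tag usually selects the parameter count (for example `llama3:8b` versus `llama3:70b`). The sketch below lists what is already downloaded and prints, without running them, the two commands for swapping one model for another. `model_ref` and `swap_plan` are hypothetical helper names for this example; `ollama list`, `ollama pull`, and `ollama rm` are the real subcommands.

```shell
#!/bin/sh
# Hypothetical helper: build a "name:tag" reference such as llama3:8b.
model_ref() {
  printf '%s:%s\n' "$1" "$2"
}

# Hypothetical helper: print (not run) the commands that
# replace model $1 with model $2.
swap_plan() {
  echo "ollama pull $2"
  echo "ollama rm $1"
}

# Show models already on disk, if Ollama is installed.
if command -v ollama >/dev/null 2>&1; then
  ollama list
else
  echo "ollama not found on PATH; install it first"
fi

# Print the plan for moving from the 8B to the 70B Llama 3 model.
swap_plan "$(model_ref llama3 8b)" "$(model_ref llama3 70b)"
```

Printing the commands first is a deliberate choice: a 70B download is tens of gigabytes, so it is worth reviewing before you run it.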

Running Your First Local AI Model

Here's exactly what it looks like to run a local AI model with Ollama. Open your terminal (Mac: Terminal app, Windows: Command Prompt or PowerShell) and run these commands:

Step 1: Install Ollama (do once)
# Download Ollama from ollama.com
# Or use a package manager:

brew install ollama          # macOS (with Homebrew)
curl -fsSL https://ollama.com/install.sh | sh  # Linux / WSL

Step 2: Pull a model (do once per model)
# Download the Llama 3 model (~4.7GB)
ollama pull llama3

# Or try Mistral (~4GB) — good for coding tasks
ollama pull mistral

# Or a smaller model for quick tasks
ollama pull phi3

Step 3: Start chatting
# Start an interactive chat session
ollama run llama3

# Ask it anything — it runs locally on your machine!
# Type /bye (or press Ctrl+D) to quit

That's it — three commands and you're chatting with an AI running entirely on your own computer. No internet required after the initial downloads.
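
Beyond the interactive chat, Ollama also serves a local HTTP API, by default on port 11434, which is how scripts and apps usually talk to it. The sketch below sends one prompt to the `/api/generate` endpoint with `curl`. The `OLLAMA_URL` variable and the `generate_body` helper are names invented for this example, and the helper does no JSON escaping, so it assumes a prompt without double quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical variable for this sketch; defaults to Ollama's local address.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Hypothetical helper: build the JSON body for /api/generate.
# No escaping is done, so keep quotes and backslashes out of the prompt.
generate_body() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}\n' "$1" "$2"
}

# Probe the server first so a missing server gives a readable message.
if curl -fsS --max-time 5 "$OLLAMA_URL/" >/dev/null 2>&1; then
  # "stream": false asks for one complete JSON reply instead of chunks.
  curl -fsS "$OLLAMA_URL/api/generate" \
    -d "$(generate_body llama3 'Why is the sky blue?')"
else
  echo "No Ollama server reachable at $OLLAMA_URL (start one with: ollama serve)"
fi
```

With a server running, the reply is a JSON object whose `response` field holds the generated text. For a quick one-off without the API, `ollama run llama3 "Why is the sky blue?"` prints an answer and exits.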

Knowledge Check

Test what you learned with this quick quiz.

Quick Quiz — 3 Questions

Question 1: What does Ollama let you do that cloud AI services don't?
Question 2: Which of these is a key benefit of running AI locally?
Question 3: What is a good starting model to try on a regular laptop?