Ollama

Download and run open LLMs locally from your terminal

github.com/ollama/ollama · ★ 175k · ollama.com

Overview

Ollama is a tool for downloading and running open large language models on your own computer. You install it on macOS, Windows, or Linux, then pull and chat with a model straight from the terminal with a single command.

It is aimed at developers who want to run models locally instead of calling a hosted service. Beyond the CLI, Ollama exposes a REST API on localhost and ships official Python and JavaScript libraries, so you can wire local models into your own apps and scripts.

As a local runtime, Ollama handles model downloads, serving, and the request loop for you, and connects to existing coding tools and agents such as Claude Code, Codex, and Copilot CLI through its launch integrations.

What it does

One-line install commands for macOS, Windows, and Linux (a shell script on macOS and Linux, a PowerShell command on Windows), plus an official Docker image
Run any model from the library with a single `ollama run` command
Built-in REST API on http://localhost:11434 for running and managing models
Official Python (`pip install ollama`) and JavaScript (`npm i ollama`) client libraries
Launch integrations for coding tools and agents like Claude Code, Codex, and Copilot CLI
Built on the llama.cpp backend for local model inference

Getting started

Install Ollama, then pull and chat with a model from the terminal. The same models are reachable over a local REST API and the Python and JavaScript libraries.

Install Ollama

Run the install script on macOS or Linux. On Windows, use the PowerShell install command from the official documentation instead (not shown here).

curl -fsSL https://ollama.com/install.sh | sh
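
Once the script finishes, you may want to confirm the binary is reachable before pulling a model. A minimal sketch, assuming the installer put an `ollama` executable on your PATH; the helper name `ollama_version` is made up for this example.

```python
import shutil
import subprocess
from typing import Optional


def ollama_version() -> Optional[str]:
    """Return the text printed by `ollama --version`, or None if ollama is not on PATH."""
    path = shutil.which("ollama")
    if path is None:
        return None
    # check=False: report whatever the CLI printed instead of raising on a non-zero exit.
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = ollama_version()
    print(version if version else "ollama not found on PATH")
```

If this reports that the binary is missing, open a new terminal so the updated PATH is picked up, then try again.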

Run and chat with a model

Pull a model from the library and start chatting in the terminal.

ollama run gemma4

Call the REST API

Ollama serves a local REST API on port 11434 for running models from your own apps.

curl http://localhost:11434/api/chat -d '{
  "model": "gemma4",
  "messages": [{
    "role": "user",
    "content": "Why is the sky blue?"
  }],
  "stream": false
}'
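
The same request can be sent from Python with only the standard library, which is handy when you do not want to add a dependency. This is a sketch, not the official client: the helper names `build_chat_request` and `ask` are made up here, and it assumes a non-streaming reply carries its text under `message.content`.

```python
import json
import urllib.request

# Default local address, as given in the text above.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(
    model: str, prompt: str, url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build the same non-streaming POST request as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str) -> str:
    """Send the request and return the reply text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_chat_request(model, prompt)) as resp:
        body = json.load(resp)
    # Assumption: the reply JSON has the shape {"message": {"content": "..."}}.
    return body["message"]["content"]
```

With the server running and the model pulled, `print(ask("gemma4", "Why is the sky blue?"))` should print the model's answer.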

Use the Python library

Install the official client and send a chat request from Python.

# Install the client first (run this in a shell, not in Python): pip install ollama

from ollama import chat

response = chat(model='gemma4', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response.message.content)
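
For longer answers you may prefer to print text as it arrives instead of waiting for the full reply. A hedged sketch, assuming the library's `chat` accepts `stream=True` and then yields chunks whose text sits under `['message']['content']`; the helper `collect_stream` is made up for this example and works on any iterable of such chunks.

```python
from typing import Any, Callable, Iterable


def collect_stream(
    chunks: Iterable[Any],
    on_text: Callable[[str], None] = lambda text: None,
) -> str:
    """Pass each chunk's text to on_text as it arrives and return the full reply."""
    parts = []
    for chunk in chunks:
        text = chunk["message"]["content"]
        on_text(text)
        parts.append(text)
    return "".join(parts)


def stream_to_terminal(model: str, prompt: str) -> str:
    """Stream a reply to stdout (needs `pip install ollama` and a running server)."""
    from ollama import chat  # imported lazily so collect_stream works without the client

    stream = chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    reply = collect_stream(stream, on_text=lambda t: print(t, end="", flush=True))
    print()
    return reply
```

Calling `stream_to_terminal("gemma4", "Why is the sky blue?")` prints the answer piece by piece and returns the joined text.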

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

Run open LLMs locally for privacy or offline work, without sending data to a hosted API
Add a local model backend to a Python or JavaScript app through the official libraries
Connect Ollama to coding tools and agents such as Claude Code, Codex, or Copilot CLI
Prototype and test prompts against different models from the library before committing to a provider
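
The prompt-prototyping case above can be sketched as a small loop that sends one prompt to several local models and collects the replies. The helper names `compare_models` and `ask_ollama` are made up for this example, and any model name other than `gemma4` is a placeholder for whatever you have pulled.

```python
from typing import Callable, Dict, Iterable


def compare_models(
    models: Iterable[str],
    prompt: str,
    ask: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run one prompt against each model and return the replies keyed by model name."""
    return {model: ask(model, prompt) for model in models}


def ask_ollama(model: str, prompt: str) -> str:
    """Ask a local model through the official client (needs a running server)."""
    from ollama import chat  # imported lazily so compare_models has no hard dependency

    response = chat(model=model, messages=[{"role": "user", "content": prompt}])
    return response.message.content


def print_comparison(models: Iterable[str], prompt: str) -> None:
    """Print each model's reply under a heading with the model name."""
    for name, reply in compare_models(models, prompt, ask_ollama).items():
        print(f"--- {name} ---\n{reply}\n")
```

Passing the `ask` function in as a parameter keeps the loop independent of the backend, so the same comparison can later point at a different provider.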

How Ollama compares

Ollama alongside the other open-source local runtime tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Ollama	★ 175k	Download and run open LLMs locally from your terminal
llama.cpp	★ 117k	A C/C++ inference engine that runs LLMs in the GGUF format on CPUs, Apple Silicon, and GPUs with low memory use.
GPT4All	★ 77.4k	A free desktop app and Python client that runs large language models locally on your own computer, with no API calls or GPU required.
LocalAI	★ 47k	A self-hosted server that exposes an OpenAI-compatible API for running text, vision, voice, and image models on local hardware.
Jan	★ 43.1k	An open-source desktop app that runs LLMs fully offline as a ChatGPT-style assistant on your own computer.
llamafile	★ 25k	A Mozilla project that packages a model and its runtime into one executable file you can copy and run on any OS.
MLC LLM	★ 22.8k	A machine-learning compiler that builds and runs LLMs across browsers, phones, and desktops using TVM-based code generation.
KTransformers	★ 17.3k	A framework for running large Mixture-of-Experts models locally by splitting work between CPU and GPU to fit limited VRAM.
