Overview
Jan is an open-source desktop app that lets you download and run large language models directly on your own machine. It works like a ChatGPT-style assistant, but the models can run fully offline, so your chats and data stay on your computer.
It is aimed at developers and privacy-minded users who want local control over which models they run. You can pull open models such as Llama, Gemma, Qwen, and GPT-oss from Hugging Face, or connect to cloud providers like OpenAI, Anthropic, Mistral, and Groq when you prefer.
As a local runtime, Jan also exposes an OpenAI-compatible API server at localhost:1337, so other applications can talk to your local models using the same request format they already use for hosted APIs.
What it does
- Download and run open LLMs (Llama, Gemma, Qwen, GPT-oss) from Hugging Face
- OpenAI-compatible local API server at localhost:1337 for other apps
- Optional cloud model connections: OpenAI, Anthropic, Mistral, Groq, MiniMax, and others
- Custom assistants for specific tasks
- Model Context Protocol (MCP) integration for agentic workflows
- Runs locally for privacy; installers for Windows, macOS, and Linux
Getting started
The quickest way to start is to download the prebuilt desktop app for your operating system; you can also build from source if you prefer.
Download the app
Get the installer for your platform from jan.ai or GitHub Releases (Windows .exe, macOS .dmg, or Linux .deb/.AppImage), then install and open it.
Download a model and chat
Inside the app, download an open model such as Llama, Gemma, or Qwen from Hugging Face, then start chatting with it locally. No code required.
Use the local API (optional)
Jan exposes an OpenAI-compatible server at localhost:1337, so existing OpenAI-style clients can point at your local models.
http://localhost:1337Build from source (optional)
Requires Node.js >= 20, Yarn >= 4.5.3, Make, and Rust (for Tauri). The dev target installs dependencies, builds, and launches the app.
git clone https://github.com/janhq/jan
cd jan
make devCommands and code are distilled from the project's own documentation — always check the official repo for the latest.
When to use it
- Run a private, offline AI assistant without sending chats to a cloud service
- Test and compare open models like Llama, Gemma, and Qwen on your own hardware
- Serve a local OpenAI-compatible endpoint for your own apps and scripts
- Switch between local models and cloud providers (OpenAI, Anthropic, Groq) from one app
How Jan compares
Jan alongside other open-source local runtimes tools AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| Ollama | ★ 175k | A developer-friendly tool that downloads and runs local LLMs from the terminal with a built-in OpenAI-compatible API. |
| llama.cpp | ★ 117k | A C/C++ inference engine that runs LLMs in the GGUF format on CPUs, Apple Silicon, and GPUs with low memory use. |
| GPT4All | ★ 77.4k | GPT4All is a free desktop app and Python client that runs large language models locally on your own computer, with no API calls or GPU required. |
| LocalAI | ★ 47k | A self-hosted server that exposes an OpenAI-compatible API for running text, vision, voice, and image models on local hardware. |
| Jan | ★ 43.1k | Run open-source LLMs offline on your own computer, like a private ChatGPT |
| llamafile | ★ 25k | A Mozilla project that packages a model and its runtime into one executable file you can copy and run on any OS. |
| MLC LLM | ★ 22.8k | A machine-learning compiler that builds and runs LLMs across browsers, phones, and desktops using TVM-based code generation. |
| KTransformers | ★ 17.3k | A framework for running large Mixture-of-Experts models locally by splitting work between CPU and GPU to fit limited VRAM. |