In plain English
A knowledge cutoff is the date after which a language model has, in effect, read nothing. The model learned everything it knows during a one-time training run on a giant pile of text. That pile was frozen on a particular day. Anything published after that day simply wasn't in the books it studied — so out of the box, the model has no idea it happened.
Think of a brilliant student who was locked in a library, read every book up to a certain morning, and then walked out and never read anything again. Ask them about history, math, or how to write a recipe and they're sharp. Ask them who won last night's game and they'll either say "I don't know" or, worse, confidently make something up. The model is that student. The day they left the library is the knowledge cutoff.
This is a direct consequence of how models are built. If you're new to the bigger picture, the companion piece "How Do LLMs Actually Work? Next-Token Prediction Explained" covers why training is a separate, one-shot event rather than a live feed.
Why it matters
The cutoff is the single biggest reason a smart-sounding model gives you a stale or wrong answer about anything recent. It quietly shapes whether you can trust a reply, and it's the root cause behind a surprising number of confident mistakes. It bites hardest in four places:
- Recent events. Election results, product launches, sports scores, who's CEO now. The model's snapshot may be months or years out of date.
- Fast-moving facts. Software versions, API changes, library names, current prices. A model can confidently recommend a function that was renamed after its cutoff.
- Itself. Ask a model about the newest models (including newer versions of itself) and it often can't answer, because they were announced after it stopped reading.
- Made-up details. When a model doesn't know, it may guess in fluent, plausible prose instead of admitting the gap. That's the link between cutoffs and hallucination.
Who cares? Anyone building on top of a model. If you're writing a support bot, a research assistant, or a coding tool, the cutoff decides whether you can rely on the model's memory or whether you must feed it fresh data yourself. Get this wrong and your app cheerfully serves last year's facts as if they were today's.
How it works
A model's life has two completely separate phases. Training happens once, over weeks or months on enormous clusters of GPUs, and bakes patterns from the training data into the model's weights. Inference is every time you actually chat with it afterward — fast, cheap, and read-only. The weights don't change when you talk to it. Nothing you say or that happens in the world gets written back into the model.
So the cutoff is just the latest date well represented in the frozen training pile. There's a subtlety, though: the reported cutoff a vendor publishes and the model's effective cutoff aren't always the same. Web crawls mix old and new pages, and de-duplication is imperfect, so a model's knowledge thins out and grows patchy in the months right before its official date. Researchers in the paper "Dated Data: Tracing Knowledge Cutoffs in Large Language Models" showed that these effective cutoffs often differ from the reported ones, so treat the published date as a soft boundary, not a hard guarantee.
Current cutoffs (as of mid-2026)
Cutoffs move with every new model generation. Here's a verified snapshot of leading families as of mid-2026 — always confirm against the official model card, because these numbers churn fast.
| Model | Reported training cutoff |
|---|---|
| Claude Opus 4.8 / 4.7 (Anthropic) | January 2026 |
| Claude Sonnet 4.6 / Opus 4.6 | August 2025 |
| Claude Haiku 4.5 | July 2025 |
| GPT-5.5 (OpenAI) | December 2025 |
| GPT-5.4 | August 2025 |
| Gemini 3 / 3.1 Pro (Google) | January 2025 |
Two things jump out. First, even the freshest models lag months behind the calendar — training, fine-tuning, and safety testing take time, so a model released in spring 2026 may only 'know' up to late 2025. Second, models in the same family can have different cutoffs: smaller, faster variants are often trained on a slightly earlier snapshot. Never assume two models from one vendor share a date.
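One way to act on both observations is to look the cutoff up per model instead of assuming it. The sketch below is illustrative only: the dictionary copies the reported dates from the table above (one representative name per row, stored as the first day of the month), and `needs_fresh_context`, `months_between`, and the three-month `soft_edge_months` margin are assumptions made for this example, not vendor guidance.

```python
from datetime import date

# Reported cutoffs copied from the table above (first day of the month).
# Confirm them against each vendor's model card before relying on them.
REPORTED_CUTOFFS = {
    "Claude Opus 4.8": date(2026, 1, 1),
    "Claude Sonnet 4.6": date(2025, 8, 1),
    "Claude Haiku 4.5": date(2025, 7, 1),
    "GPT-5.5": date(2025, 12, 1),
    "GPT-5.4": date(2025, 8, 1),
    "Gemini 3 Pro": date(2025, 1, 1),
}


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (negative if reversed)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def needs_fresh_context(model: str, event_date: date, soft_edge_months: int = 3) -> bool:
    """True when the event falls after the reported cutoff, or inside the
    thin stretch just before it (the margin is an assumed value)."""
    cutoff = REPORTED_CUTOFFS[model]
    return months_between(event_date, cutoff) < soft_edge_months


# A March 2026 event is past GPT-5.4's reported August 2025 cutoff.
print(needs_fresh_context("GPT-5.4", date(2026, 3, 15)))
# A February 2025 event sits well before that cutoff.
print(needs_fresh_context("GPT-5.4", date(2025, 2, 10)))
```

A check like this only decides when to fetch fresh data; it says nothing about whether the model's memory of older facts is correct.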
How chatbots know today's news anyway
If the weights are frozen, how does ChatGPT tell you who won an election yesterday? It doesn't remember it — a tool fetched it and slipped the text into the conversation, so the model can read it the same way it reads your question. The cutoff is untouched; the model is just reading a fresh page you (or a tool) handed it.
Baked into the weights (frozen at training time):
- Grammar, reasoning, coding patterns
- Stable facts (math, history)
- Everything up to the cutoff
- Cannot be updated without retraining

Supplied in the context window (fresh on every request):
- Today's news and prices
- Your private docs and data
- Anything after the cutoff
- Injected fresh into each request
There are three common ways to smuggle the present into a model:
- Web search / browsing. The model calls a search tool, the system fetches live pages, and their text is added to the prompt. Google calls this Grounding with Google Search; OpenAI and Anthropic ship built-in web search tools. This is a flavor of function calling where one of the tools happens to be a search engine.
- Retrieval-Augmented Generation (RAG). Instead of the open web, you pull relevant chunks from your knowledge base (docs, tickets, a wiki) and paste them in. See "What Is RAG?" for the full pattern.
- Just paste it yourself. Copy the article, contract, or changelog straight into the chat. The simplest workaround of all, and often the most reliable.
All three share one idea: the answer comes from text in the context window, not from the model's baked-in memory. That's why a model with a 2025 cutoff can still discuss a 2026 event — as long as someone put the 2026 text in front of it.
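To make that shared idea concrete, here is a minimal sketch of the retrieval variant. It assumes a toy keyword-overlap ranker standing in for a real search index or embedding model; the names `tokenize`, `retrieve`, and `build_prompt`, the sample documents, and the `[CONTEXT]`/`[QUESTION]` layout are all invented for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for real embeddings."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(question: str, documents: list[str], top_k: int = 1) -> str:
    """Place retrieved text in the prompt so the answer comes from context."""
    chunks = retrieve(question, documents, top_k)
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"[CONTEXT]\n{context}\n\n[QUESTION]\n{question}"


# Hypothetical knowledge base; the first entry postdates the model's cutoff.
docs = [
    "Release notes: version 3.2 renamed load_data to read_table.",
    "Office hours are Monday to Friday, 9am to 5pm.",
    "Billing: invoices are sent on the first day of each month.",
]
prompt = build_prompt("What was load_data renamed to?", docs)
print(prompt)
```

The model never needs to have seen the rename during training; the release note travels with the question. Swapping the ranker for a web search call or a vector store changes where the text comes from, not the pattern.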
Common pitfalls
- Assuming 'it has internet.' A raw API call has no web access. Plain models answer purely from training memory unless you enable a search tool or paste in data. The friendly chatbot UI may search automatically; the bare API does not.
- Trusting confident dates. A model will happily state a recent fact in fluent prose even when it's guessing. Confidence is not currency — verify anything time-sensitive.
- Forgetting retrieved ≠ understood. Even with search on, the model can pick a weak source, blend it with stale training assumptions, or quote it badly. Retrieval reduces stale answers; it doesn't eliminate mistakes.
- Ignoring the soft edge. Knowledge gets sparse in the months just before the cutoff. A March cutoff doesn't mean rich coverage of February — treat the last stretch as shaky.
Going deeper
Once you accept that a cutoff is a soft, fuzzy boundary rather than a clean wall, a few sharper ideas follow. The most important: never hard-code a single date as 'the truth' about what a model knows.
Effective vs. reported cutoff
The Dated Data researchers probed models by checking which versions of frequently edited documents they had memorized. The result: a model's true knowledge boundary can sit before the advertised date, and it varies by topic and source. Web pages crawled in a 'new' dump often contain old snapshots, and de-duplication misses near-duplicates. Practically, this means you should test a model on your domain's recent facts rather than trusting the headline number.
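Testing a model on your own recent facts can be as simple as a dated quiz. The sketch below is not the paper's method; it is a much cruder question-and-answer probe, and every name in it (`accuracy_by_month`, `effective_cutoff`, the `ask` callable, the 0.5 threshold, and the placeholder probes) is an assumption for illustration. In practice `ask` would wrap a real model call with search disabled.

```python
from collections import defaultdict
from typing import Callable, Optional

# Each probe: (month the fact became true as "YYYY-MM", question, expected substring).
Probe = tuple[str, str, str]


def accuracy_by_month(ask: Callable[[str], str], probes: list[Probe]) -> dict[str, float]:
    """Fraction of probes answered correctly, grouped by the month of the fact."""
    hits: dict[str, list[bool]] = defaultdict(list)
    for month, question, expected in probes:
        hits[month].append(expected.lower() in ask(question).lower())
    return {month: sum(results) / len(results) for month, results in sorted(hits.items())}


def effective_cutoff(accuracy: dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Latest month whose accuracy still meets the (assumed) threshold."""
    good = [month for month, score in accuracy.items() if score >= threshold]
    return max(good) if good else None


def fake_model(question: str) -> str:
    """Stand-in for a real model call: knows only the two earliest facts."""
    known = {"Q1": "alpha", "Q2": "beta"}
    return known.get(question, "I don't know")


# Placeholder probes; replace with dated facts from your own domain.
probes = [
    ("2025-05", "Q1", "alpha"),
    ("2025-06", "Q2", "beta"),
    ("2025-07", "Q3", "gamma"),
]
scores = accuracy_by_month(fake_model, probes)
print(scores, effective_cutoff(scores))
```

With only a handful of probes per month the estimate is noisy, and one lucky guess in a late month can push it forward, so read the whole accuracy curve rather than the single date.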
Why not just retrain constantly?
Full pretraining is staggeringly expensive — the reason scaling laws and GPU costs dominate the conversation. You can't retrain nightly to absorb the news. That's exactly why the industry leaned into retrieval and tools instead of perpetual retraining: it's far cheaper to fetch the present at inference time than to rebake it into the weights.
Always tell the model the date
A subtle trick: a model has no built-in clock. If you don't tell it today's date in the system prompt, it may assume 'now' is somewhere around its cutoff and reason about timing incorrectly — calling a 2026 event 'upcoming' or 'hypothetical.' Stamping the current date into the prompt is a one-line fix that prevents a class of temporal confusion.
```python
from datetime import date

# A model has no clock. Hand it the date and the fresh facts.
# `fetch_live_context` and `model` are placeholders for your own
# search/RAG helper and model client.
system = (
    f"Today is {date.today().isoformat()}.\n"
    "Your training data ends earlier than today. "
    "If a question depends on recent events, rely ONLY on the "
    "context provided below. Do not answer from memory.\n"
)

retrieved = fetch_live_context("latest model releases")  # search or RAG
prompt = system + f"\n[CONTEXT]\n{retrieved}\n\n[QUESTION]\nWhat shipped this week?"

# The cutoff is unchanged; the answer now comes from `retrieved`,
# not from the model's frozen memory.
answer = model.generate(prompt)
```

FAQ
What is a knowledge cutoff in an LLM?
It's the date after which a model has, in effect, read nothing. The model learned from a frozen pile of text during a one-time training run, so it has no built-in knowledge of anything published after that date — unless a tool fetches it and adds it to the conversation.
Why doesn't ChatGPT know about recent events?
Because its training data was frozen on a date in the past. The underlying model only 'remembers' what it read before then. When the chatbot does report recent news, a separate web-search tool fetched the page and pasted it into the prompt — the model itself didn't learn it.
What are the current knowledge cutoff dates as of 2026?
As of mid-2026, Claude Opus 4.8 and 4.7 report a January 2026 cutoff, GPT-5.5 reports December 2025, and Gemini 3 / 3.1 Pro report January 2025. These move with every release and can differ between models in the same family, so always confirm against the official model card.
Can I trust a model when it tells me its own cutoff date?
No. Models are unreliable at self-reporting their cutoff and may state a wrong, hedged, or hallucinated date. Check the vendor's official documentation or model card instead of asking the model directly.
How do I get a model to use current information?
Give it the data instead of relying on memory: enable a web-search tool, use Retrieval-Augmented Generation (RAG) over your own documents, or simply paste the relevant text into the prompt. Also tell the model today's date, since it has no built-in clock.
Is the reported cutoff date exact?
Treat it as a soft boundary, not a hard line. Research shows a model's effective cutoff often sits earlier than the advertised one, and its knowledge grows sparse and patchy in the months right before the official date. Test recent facts in your own domain rather than trusting the headline number.