AI/TLDR — New AI Releases Daily: Models, Tools, Repos & Papers

A high-volume feed of new AI releases (models, open-source repos, developer tools, papers, datasets, and benchmarks), refreshed every 2 hours. Each release is explained in plain English so you actually understand what shipped.

AI/TLDR — every new AI model, tool, repo & paper

The latest AI releases, refreshed every 2 hours and explained in plain English.

What AI shipped today?

In the last 24 hours, AI/TLDR tracked 6 new AI releases, including Sam Witteveen: 'MiniCPM5 - The 1B Cognitive Core?'; Generals: Zero Hour on Mac & iOS — Ammaar Reshi's Claude Fable port; and Hugging Face Transformers 5.13.0 — nine new architectures and unified HfExporter. AI/TLDR is an AI release tracker that follows new AI models, open-source tools, papers, datasets, and benchmarks. It is refreshed every 2 hours from verified primary sources and explained in plain English.

AI Release Index — live stats on AI releases · Learn AI

Sam Witteveen: 'MiniCPM5 - The 1B Cognitive Core?'
Sam Witteveen · 2026-07-05 · video
Sam Witteveen tests MiniCPM5-1B, OpenBMB's on-device model that Artificial Analysis crowned the leading 1B open-weight LLM. He walks through the hybrid-reasoning template and its agentic tool use versus larger 2B rivals.
Generals: Zero Hour on Mac & iOS — Ammaar Reshi's Claude Fable port
Ammaar Reshi · 2026-07-04 · showcase
Ammaar Reshi shipped a native macOS, iPhone, and iPad port of Command & Conquer Generals: Zero Hour built with Claude Code on the Fable model, with the 2003 engine rendering through DXVK and MoltenVK to Metal.
Hugging Face Transformers 5.13.0 — nine new architectures and unified HfExporter
Hugging Face · 2026-07-03 · repo
Hugging Face Transformers 5.13.0 lands nine new model architectures — including Kimi K2.5–K2.7, MiMo-V2-Flash, Zyphra ZAYA, VideoPrism, RADIO, and MiniCPM3 — and introduces HfExporter, a unified export API covering PyTorch, ONNX, and ExecuTorch.
Chrome DevTools MCP 1.5.0 — heap snapshot comparison lands for coding agents
Chrome DevTools · 2026-07-03 · repo
Chrome DevTools MCP 1.5.0 ships two new memory-analysis tools for AI agents: get_heapsnapshot_duplicate_strings for spotting string-interning leaks, and heap-snapshot comparison for diffing memory between runs. Directory permissions were also hardened to 0o700.
Simon Willison — sqlite-utils 4.0rc2, mostly written by Claude Fable
Simon Willison · 2026-07-05 · article
Simon Willison shipped sqlite-utils 4.0rc2 with Claude Fable doing most of the work — 37 prompts, 34 commits, 30 files, $149.25 at unsubsidized API rates — and caught a data-loss bug in delete_where() during pre-release review.
Armin Ronacher — Better Models: Worse Tools
Armin Ronacher · 2026-07-04 · article
Armin Ronacher shows Claude Opus 4.8 and Claude Sonnet 5 emit malformed tool calls on schemas they weren't RL-trained against — appending invented JSON keys that Claude Code silently repairs but other harnesses reject.
Simon Willison — let Fable delegate coding tasks to cheaper models
Simon Willison · 2026-07-03 · article
Simon Willison shows how Fable, running as the main Claude Code loop, can spawn subagents on Sonnet for substantive coding and Haiku for trivial edits — using Fable's judgement to route work and cut cost without losing quality.
Epoch AI — CVE severity spike after Claude Mythos Preview
Epoch AI · 2026-07-02 · article
Epoch AI shows serious CVE disclosures from 21 major vendors jumped to about 1,500 high- or critical-severity fixes in June 2026, more than 3.5 times the pre-Mythos monthly record.
Claude Enterprise — spend alerts, model entitlements, and admin API
Anthropic · 2026-07-02 · tool
Anthropic ships richer Claude Enterprise admin analytics, spend-threshold alerts at 75% and 90%, per-role model entitlements, and an Admin API for automating cost controls at scale.
Simon Willison — using DSPy to fix Datasette Agent's SQL prompts
Simon Willison · 2026-07-02 · article
Simon Willison lets Claude Code run DSPy against Datasette Agent's SQL system prompt and finds one culprit: the schema listing skips column names, which pushes the model into guessing and error-retry loops.
Wafer runs GLM-5.2 on AMD MI355X — 2x cheaper inference than Blackwell
Wafer · 2026-07-03 · showcase
Wafer quantized Z.ai's 753B GLM-5.2 to MXFP4 and ran it on AMD MI355X with sglang. Result: 2,626 tokens/sec per node, 213 tokens/sec single-stream, ~80% of Blackwell B200 throughput at 2.75x lower GPU cost.
Two Minute Papers: 'They Said This Will Never Run In Real Time'
Two Minute Papers · 2026-07-03 · video
Two Minute Papers walks through JGS2, a GPU solver for elastodynamic simulation that hits near-Newton convergence while staying parallel — the authors report 50 to 100 times faster convergence than existing GPU methods.
Open Source AI Gap Map — Current AI charts 421 open-source AI products in one place
Current AI · 2026-07-02 · resource
Current AI's Gap Map v0.1 indexes 421 open-source AI products from 228 organizations across models, tools, datasets, and hardware, scoring each on openness, capability, and adoption.
local-llm — jamesob's field guide to running GLM-5.2 and Qwen3.6 on your desk
James O'Beirne · 2026-07-03 · tutorial
James O'Beirne's local-llm repo documents two full builds for running SOTA open-weight models at home: a ~$2K dual-RTX-3090 rig for Qwen3.6-27B and a ~$40K quad-RTX-6000-Pro rig for GLM-5.2 594B.
pxpipe — proxy that renders Claude Fable 5 context as PNGs to cut input tokens
pxpipe · 2026-07-03 · tool
pxpipe is a localhost proxy for Anthropic's API that rewrites bulky system prompts, tool docs, and older chat turns as PNG images the model reads with vision. Cuts input tokens 59–70%; MIT-licensed, runs with npx.
EdgeBench — ByteDance's 134-task long-horizon agent benchmark
ByteDance Seed · 2026-07-02 · benchmark
EdgeBench is a ByteDance Seed benchmark with 134 real-world tasks that each run 12+ hours to test how AI agents learn from executable environments over long horizons.
llm-coding-agent 0.1a0 — Simon Willison's alpha Fable 5 coding agent
Simon Willison · 2026-07-02 · tool
Simon Willison released llm-coding-agent 0.1a0, a Python coding agent built on his LLM library that runs against Claude Fable 5 and exposes six tools for reading files, editing code, and running shell commands.
Program-as-Weights — compile English into LoRA adapters for a local 0.6B model
Waterloo + Cornell + Harvard · 2026-07-02 · paper
Program-as-Weights (PAW) is a compiler that turns natural-language specs like 'repair broken JSON' into a LoRA adapter for a frozen 0.6B model. Runs at 30 tok/s on a MacBook M3 and matches Qwen3-32B prompting with roughly 50x less memory.
Safari MCP server — Apple lets AI agents drive Safari to debug websites
Apple WebKit · 2026-07-01 · tool
Apple's Safari Technology Preview 247 ships an MCP server that gives coding agents 17 tools to inspect the DOM, capture screenshots, evaluate JavaScript, and drive a live Safari window. Runs locally, no data leaves the device.
Alibaba bans Claude Code — cites Chinese-timezone fingerprinting
Alibaba · 2026-07-03 · ecosystem
Alibaba will block employees from using Claude Code from July 10 after developers found the tool checked for Asia/Shanghai and Asia/Urumqi timezones and Chinese-lab proxy strings. Anthropic says it was an anti-abuse experiment and rolled it back on July 1.
Cloudflare — separate AI crawler controls for Search, Agent, and Training bots
Cloudflare · 2026-07-01 · ecosystem
Cloudflare adds three separate switches for Search, Agent, and Training AI bots on every zone. Site owners can allow search crawlers while blocking training and agent traffic. New defaults on ad-supported pages take effect September 15, 2026.
Gemini Interactions API GA — Google's unified endpoint for models and agents
Google · 2026-07-01 · tool
Google's Interactions API leaves public beta as the default endpoint for both Gemini model inference and managed agents, with server-side state, background execution, and Deep Research on one interface.
Anthropic CJS — a 0-to-4 severity scale for AI cyber jailbreaks
Anthropic · 2026-07-02 · resource
Anthropic and Project Glasswing partners publish a draft Cyber Jailbreak Severity scale, CJS 0 to CJS 4, plus a HackerOne bounty program that pays researchers who report jailbreaks against Claude Fable 5.
Claude Code 2.1.198 — subagents run in the background by default
Anthropic · 2026-07-01 · tool
Claude Code 2.1.198 flips subagents to background-by-default so the main turn keeps working while they run, promotes Claude in Chrome to GA, and ships a /dataviz skill for chart and dashboard design.
AI Explained: 'Fable 5 vs GPT 5.6 Sol — The Early Results'
AI Explained · 2026-07-02 · video
AI Explained publishes an early hands-on comparison of Claude Fable 5 and OpenAI's GPT-5.6 Sol days after Anthropic's global redeploy and OpenAI's limited preview.
Kimi K2.7-Code in GitHub Copilot — first open-weight model in the picker
GitHub · 2026-07-01 · tool
GitHub Copilot adds Moonshot AI's Kimi K2.7-Code to its model picker — the first open-weight option. It rolls out to Copilot Pro, Pro+, and Max today, with Business and Enterprise following in the coming weeks.
xAI Voice Agent Builder — no-code builder for production voice agents
xAI · 2026-07-01 · tool
xAI's Voice Agent Builder is a no-code platform for production voice agents on Grok Voice. It costs $0.05/min for agent audio plus $0.01/min for telephony on provisioned numbers, with sub-second latency and 25+ languages.
ZCode — Z.ai's official coding harness for GLM-5.2
Z.ai · 2026-07-01 · tool
ZCode is Z.ai's first-party desktop coding agent for GLM-5.2. It ships Goals for long-running tasks, remote control from WeChat, Feishu, or Telegram, and native installers for macOS, Windows, and Linux.
Claude in Microsoft Foundry — Opus 4.8 and Haiku 4.5 go GA on Azure
Anthropic · 2026-06-29 · ecosystem
Claude Opus 4.8 and Claude Haiku 4.5 are now generally available in Microsoft Foundry on Azure, billed through the customer's Microsoft Enterprise Agreement and running on NVIDIA GB300 Blackwell Ultra GPUs.
GeneBench-Pro — OpenAI's 129-problem computational-biology benchmark
OpenAI · 2026-06-30 · benchmark
GeneBench-Pro is a 129-problem benchmark from OpenAI that grades AI agents on messy, judgment-heavy computational biology. GPT-5.6 Sol Pro tops it at 31.5%; Claude Opus 4.8 lands second at 16.0%.
DeepSeek V4 gets peak-hour pricing — API doubles 9am–12pm and 2pm–6pm Beijing time
DeepSeek · 2026-06-29 · ecosystem
DeepSeek notified users on June 29 that when V4 ships mid-July, API rates for V4 Pro and V4 Flash will double during two Beijing peak windows (9–12 and 14–18) to relieve compute congestion. Off-peak rates hold at the May reductions.
Two Minute Papers: 'This New AI Model Changes Everything'
Two Minute Papers · 2026-07-01 · video
Two Minute Papers walks through GLM-5.2, Z.ai's open-weight coding model with a 1M-token context window, framing it as the first open model to plausibly close the gap on frontier closed labs.
Claude Fable 5 redeployed — Anthropic ships globally after US lifts export controls
Anthropic · 2026-06-30 · ecosystem
Claude Fable 5 becomes available worldwide on July 1 after the US removes export controls issued in June. Anthropic adds a defense-in-depth stack that, in over 99% of cases, blocks the jailbreak that got the model paused.
Wes Roth: 'FABLE 5 IS BACK' — reacting to the Claude Fable 5 redeployment
Wes Roth · 2026-07-01 · video
Wes Roth walks through Anthropic's June 30 announcement that US export controls on Claude Fable 5 are lifted and the model returns worldwide on July 1, 2026.
Leanstral 1.5 — Mistral's updated Lean 4 formal-proof model
Mistral AI · 2026-07-02 · model
Leanstral 1.5 saturates miniF2F at 100%, solves 587/672 PutnamBench problems, and finds 5 previously unreported bugs across 57 open-source repos. Apache-2.0 weights on Hugging Face and a free labs-leanstral-1-5 API.
TabFM — Google's zero-shot foundation model for tabular data
Google Research · 2026-06-30 · model
TabFM is a Google Research foundation model that classifies and predicts on tabular data in a single forward pass, no per-task training. Weights are on Hugging Face, Apache-2.0 code is on GitHub, BigQuery hook next.
Gemini Omni Flash + Nano Banana 2 Lite — Google's new video and image models
Google · 2026-06-30 · model
Google launches Gemini Omni Flash, a $0.10/sec video model with conversational editing, alongside Nano Banana 2 Lite, an image model that ships a result in 4 seconds at $0.034 each.
1littlecoder: 'Claude Sonnet 5 in 12 mins!'
1littlecoder · 2026-06-30 · video
1littlecoder publishes a 12-minute walkthrough of Claude Sonnet 5, Anthropic's new agentic Sonnet that approaches Opus 4.8 quality at Sonnet pricing.
Claude Sonnet 5 — Anthropic's new agentic Sonnet at Opus-class quality
Anthropic · 2026-06-30 · model
Claude Sonnet 5 is Anthropic's most agentic Sonnet yet, with a 1M-token context and adaptive thinking. It targets Opus 4.8 quality at lower cost and is now the default for Free and Pro plans.
Claude Code is steganographically marking requests — hidden prompt fingerprints
Thereallo · 2026-06-30 · article
Researcher Thereallo found that Claude Code silently rewrites its system prompt with steganographic markers when ANTHROPIC_BASE_URL is set, encoding proxy hostnames against an XOR-obfuscated competitor list.
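
The peak-hour rule in the DeepSeek V4 item above (rates double during the 9–12 and 14–18 Beijing windows) can be pictured with a small sketch. This is only an illustration under assumptions: the function names are hypothetical, the windows are treated as half-open hour ranges (the feed does not say whether 12:00 and 18:00 themselves count as peak), and no real DeepSeek price or API is used.

```python
from datetime import datetime, timedelta, timezone

# Beijing time is UTC+8 year-round (no daylight saving).
BEIJING = timezone(timedelta(hours=8))

# Assumption: each window is half-open, [start_hour, end_hour).
PEAK_WINDOWS = [(9, 12), (14, 18)]


def peak_multiplier(moment: datetime) -> int:
    """Return 2 inside a peak window, otherwise 1.

    `moment` should be timezone-aware; it is converted to Beijing time.
    """
    hour = moment.astimezone(BEIJING).hour
    in_peak = any(start <= hour < end for start, end in PEAK_WINDOWS)
    return 2 if in_peak else 1


def effective_rate(base_rate: float, moment: datetime) -> float:
    """Scale a hypothetical off-peak rate by the peak multiplier."""
    return base_rate * peak_multiplier(moment)
```

For example, 02:30 UTC is 10:30 in Beijing, so `peak_multiplier` returns 2 for that moment, while 04:00 UTC (12:00 in Beijing) falls outside the assumed half-open window and returns 1.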

Frequently asked questions

What is AI/TLDR?

AI/TLDR is a high-volume tracker of new AI releases — models, open-source repos, developer tools, papers, datasets, benchmarks and security findings — refreshed every 2 hours and explained in plain English.

How often is the feed updated?

An automated agent sweeps every 2 hours and publishes a fresh build to the site. Items are sorted by ingest time so the newest releases always float to the top.
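
As a minimal sketch of the ordering described above, the snippet below sorts feed items by ingest time, newest first. The `ingested_at` field name, the ISO 8601 timestamp format, and the sample items are assumptions made for illustration; the site's actual JSON schema is not shown here.

```python
from datetime import datetime


def newest_first(items):
    """Return items sorted by ingest time, most recent first."""
    return sorted(
        items,
        # Hypothetical field: an ISO 8601 timestamp set when the item was ingested.
        key=lambda item: datetime.fromisoformat(item["ingested_at"]),
        reverse=True,
    )


feed = [
    {"title": "older item", "ingested_at": "2026-07-03T08:00:00+00:00"},
    {"title": "newer item", "ingested_at": "2026-07-05T10:00:00+00:00"},
]
print([item["title"] for item in newest_first(feed)])
# prints ['newer item', 'older item']
```

Sorting on ingest time rather than publication date means an item published earlier but picked up later can appear above a more recently published one, which would be consistent with the dates in the feed not being strictly descending.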

Is AI/TLDR free?

Yes — the site is free to read with no signup. There is an optional newsletter and a Buy-Me-a-Coffee tip jar if you want to support it.

Where does the data come from?

Every item is fetched from and verified against a primary source: a vendor blog post, GitHub release, arXiv paper, or official announcement. Nothing is hallucinated; if a URL or claim cannot be verified, the item is dropped.

How do you decide what's worth covering?

We cover what is drawing attention: frontier-lab releases, widely discussed open-source drops, stories picked up by multiple outlets, and pricing or capability shifts. Items are tagged seismic, major, or notable based on impact.

Can I subscribe to a newsletter?

Yes — there is a daily digest delivered via Buttondown. Subscribe from the homepage banner.

Learn AI from zero

New to LLMs, RAG or agents? Our free Learn AI encyclopedia explains every concept, tool and framework in plain English — 652 articles and counting.

LLM Fundamentals · Prompt Engineering · Working with LLM APIs · Embeddings & Vector Databases · Retrieval-Augmented Generation (RAG) · AI Agents · Agent SDKs & Frameworks · AI Coding & Developer Tools · Fine-Tuning & Model Customization · Local & Open Models · Multimodal AI · Production & LLMOps · Evaluation & Safety · Building AI Apps

Compare AI models

Our LLM registry tracks 251 large language models — frontier and open-weight — with verified specs, benchmarks, pricing and APIs, one detail page each.

Anthropic · OpenAI · Google · Meta · DeepSeek · Alibaba (Qwen) · Moonshot AI (Kimi) · Z.ai (Zhipu / GLM) · xAI (Grok) · Mistral AI · Cohere · MiniMax