AI/TLDR — every new AI model, tool, repo & paper
The latest AI releases, refreshed every 2 hours and explained in plain English.
What AI shipped today?
In the last 24 hours AI/TLDR tracked 6 new AI releases, including AI Explained: 'Fable 5 vs GPT 5.6 Sol — The Early Results', Kimi K2.7-Code in GitHub Copilot — first open-weight model in the picker and xAI Voice Agent Builder — no-code builder for production voice agents. AI/TLDR is an AI release tracker that follows new AI models, open-source tools, papers, datasets and benchmarks — refreshed every 2 hours from verified primary sources and explained in plain English.
- AI Explained: 'Fable 5 vs GPT 5.6 Sol — The Early Results'
AI Explained publishes an early hands-on comparison of Claude Fable 5 and OpenAI's GPT-5.6 Sol days after Anthropic's global redeploy and OpenAI's limited preview.
- Kimi K2.7-Code in GitHub Copilot — first open-weight model in the picker
GitHub Copilot adds Moonshot AI's Kimi K2.7-Code to its model picker — the first open-weight option. It rolls out to Copilot Pro, Pro+, and Max today, with Business and Enterprise following in the coming weeks.
- xAI Voice Agent Builder — no-code builder for production voice agents
xAI's Voice Agent Builder is a no-code platform for production voice agents on Grok Voice. It costs $0.05/min for agent audio plus $0.01/min for telephony on provisioned numbers, with sub-second latency and 25+ languages.
- ZCode — Z.ai's official coding harness for GLM-5.2
ZCode is Z.ai's first-party desktop coding agent for GLM-5.2. It ships Goals for long-running tasks, remote control from WeChat, Feishu, or Telegram, and native installers for macOS, Windows, and Linux.
- Claude in Microsoft Foundry — Opus 4.8 and Haiku 4.5 go GA on Azure
Claude Opus 4.8 and Claude Haiku 4.5 are now generally available in Microsoft Foundry on Azure, billed through the customer's Microsoft Enterprise Agreement and running on NVIDIA GB300 Blackwell Ultra GPUs.
- GeneBench-Pro — OpenAI's 129-problem computational-biology benchmark
GeneBench-Pro is a 129-problem benchmark from OpenAI that grades AI agents on messy, judgment-heavy computational biology. GPT-5.6 Sol Pro tops it at 31.5%; Claude Opus 4.8 lands second at 16.0%.
- DeepSeek V4 gets peak-hour pricing — API doubles 9am–12pm and 2pm–6pm Beijing time
DeepSeek notified users on June 29 that when V4 ships mid-July, API rates for V4 Pro and V4 Flash will double during two Beijing peak windows (9–12 and 14–18) to relieve compute congestion. Off-peak rates hold at the May reductions.
- Two Minute Papers: 'This New AI Model Changes Everything'
Two Minute Papers walks through GLM-5.2, Z.ai's open-weight coding model with a 1M-token context window, framing it as the first open model to plausibly close the gap on frontier closed labs.
- Claude Fable 5 redeployed — Anthropic ships globally after US lifts export controls
Claude Fable 5 becomes available worldwide on July 1 after the US removes export controls issued in June. Anthropic adds a defense-in-depth stack that, in over 99% of cases, blocks the jailbreak that got the model paused.
- Wes Roth: 'FABLE 5 IS BACK' — reacting to the Claude Fable 5 redeployment
Wes Roth walks through Anthropic's June 30 announcement that US export controls on Claude Fable 5 are lifted and the model returns worldwide on July 1, 2026.
- Leanstral 1.5 — Mistral's updated Lean 4 formal-proof model
Leanstral 1.5 updates Mistral's Lean 4 theorem-proving model with a 119B-parameter MoE, 6.5B active weights, and a 256K context window. It is free to try in the Mistral Labs playground under the model id labs-leanstral-1-5.
- TabFM — Google's zero-shot foundation model for tabular data
TabFM is a Google Research foundation model that classifies and predicts on tabular data in a single forward pass, with no per-task training. Weights are on Hugging Face, the Apache-2.0 code is on GitHub, and a BigQuery hook comes next.
- Gemini Omni Flash + Nano Banana 2 Lite — Google's new video and image models
Google launches Gemini Omni Flash, a $0.10/sec video model with conversational editing, alongside Nano Banana 2 Lite, an image model that returns a result in 4 seconds at $0.034 per image.
- 1littlecoder: 'Claude Sonnet 5 in 12 mins!'
1littlecoder publishes a 12-minute walkthrough of Claude Sonnet 5, Anthropic's new agentic Sonnet that approaches Opus 4.8 quality at Sonnet pricing.
- Claude Sonnet 5 — Anthropic's new agentic Sonnet at Opus-class quality
Claude Sonnet 5 is Anthropic's most agentic Sonnet yet, with a 1M-token context and adaptive thinking. It targets Opus 4.8 quality at lower cost and is now the default for Free and Pro plans.
- Claude Code is steganographically marking requests — hidden prompt fingerprints
Researcher Thereallo found that Claude Code silently rewrites its system prompt with steganographic markers when ANTHROPIC_BASE_URL is set, encoding proxy hostnames against an XOR-obfuscated competitor list.
- Claude Science — Anthropic's AI workbench for life-sciences research
Claude Science is an Anthropic desktop app that gives life-sciences researchers one workbench for code, 60+ scientific databases, native protein and genome rendering, and a reviewer agent that catches citation and calculation errors.
- Sam Witteveen: 'Introducing the Gemini Omni Flash API'
Sam Witteveen walks through the Gemini Omni Flash API — Google DeepMind's multimodal video-generation model, now reachable from code as Google opens its developer rollout.
- Agents-A1 — Shanghai AI Lab 35B MoE matches trillion-parameter agents
Agents-A1 is an open-weight 35B Mixture-of-Experts agent from Shanghai AI Laboratory. It posts SOTA scores on SEAL-0 (56.4), FrontierScience-Research (40.0), and IFBench (80.6), and the paper claims parity with trillion-parameter agents.
- LongCat-2.0 — Meituan's 1.6T open-source MoE for agentic coding
Meituan released LongCat-2.0, a 1.6T-parameter open-source mixture-of-experts model with ~48B active per token. It scores 59.5 on SWE-bench Pro and 70.8 on Terminal-Bench 2.1, was trained entirely on Chinese AI ASICs, and is released under the MIT license.
- Quesma: 'Qwen3.6 27B is the sweet spot for local development'
Piotr Migdał argues Qwen3.6 27B is the local-dev sweet spot: ~32 tok/s on a MacBook M5 Max with 8-bit llama.cpp, fits in 42GB RAM, and reaches roughly mid-2025 frontier quality. The post hit 875 points on the Hacker News front page.
- Simon Willison: Ornith-1.0 — hands-on with the open-weights coding model
Simon Willison runs DeepReinforce's new Ornith-1.0 — an MIT-licensed coding model family (9B, 31B, 35B MoE, 397B MoE) built on Gemma 4 and Qwen 3.5 — through LM Studio and a Pi agent loop on a Datasette codebase.
- Cursor for iOS — native mobile app for cloud and remote coding agents
Cursor for iOS is now in public beta on the App Store, letting paid users launch cloud agents, remote-control their desktop Cursor, and merge pull requests from a phone.
- Brain2Qwerty v2 — Meta non-invasive brain-to-text hits 61% word accuracy
Brain2Qwerty v2 decodes typed sentences from MEG brain recordings at 61% word accuracy, up from 8% for prior non-invasive methods. Meta trained the model on 22,000 sentences from 9 volunteers, with code and the paper now public.
- Cline 4.0 — SDK rewrite rolled back to 3.89 two days after launch
Cline 4.0.0 migrated the VS Code extension to a shared SDK session layer with a Plugins marketplace, ClinePass billing, queued chat, and edit-and-regenerate — but launch-day regressions led to a 4.0.1 rollback to the 3.89.2 codebase two days later.
- Weave Router — drop-in proxy that picks the right LLM per request
Weave Router is an open-source proxy for Claude Code, Codex, and Cursor that scores each prompt with an on-box ONNX embedder and routes it to the best model across Anthropic, OpenAI, Gemini, and OpenRouter providers in under 50ms.
- cognee v1.2.2 — truth-subspace reranking for the open-source agent memory platform
cognee v1.2.2 adds truth-subspace reranking: an opt-in retrieval layer that builds centroids from distilled session lessons so the open-source agent memory platform reorders search hits toward what its graph has already learned.
- CVE-2026-LGTM — Andrew Nesbitt's satirical AI supply-chain incident report
Andrew Nesbitt's satirical post-mortem walks a fake malicious npm package past seven AI security gates that each fail for a different reason, dramatizing correlated LLM blind spots and prompt-injection in automated code review.
- GPT-5.6 rollout delayed — US government will vet every customer
OpenAI postponed the broad GPT-5.6 launch at the Trump administration's request, limiting initial access to about 20 government-vetted partners. The Office of the National Cyber Director will approve customers one by one.
- Wes Roth: 'HERMES AGENT + Stripe Payments + NVIDIA Nemotron is INSANE!'
Wes Roth covers three fresh AI ship-events at once: Nous Research's Hermes Agent, the new Stripe Payments integration for agents, and the latest NVIDIA Nemotron release — and explains how they fit together for builders.
- Simon Willison: '2,000 people tried to hack my AI assistant'
Simon Willison covers Fernando Irarrázaval's HackMyClaw challenge, where 2,000 participants sent 6,000 email-based prompt injection attempts at a Claude Opus 4.6 assistant. The $1,000 bounty went unclaimed — no one extracted the protected secret.
- Codex Remote GA — control desktop Codex from ChatGPT mobile
Codex Remote is generally available on all ChatGPT plans. Mobile users start or continue Codex work on a paired Mac or Windows host, with one-to-one QR pairing and a new DigitalOcean Droplet workspace plugin for ad-hoc cloud boxes.
- GPT-4.5 retired from ChatGPT — end of the GPT-4 era in the app
OpenAI removed GPT-4.5 from ChatGPT on June 26, 2026, including from custom GPTs. Existing GPT-4.5 conversations continue on GPT-5.5. The change applies only to ChatGPT — the OpenAI API still serves every GPT-4 model unchanged.
- DSpark + DeepSpec — DeepSeek opens its speculative decoding stack
DeepSeek released DeepSpec, an MIT-licensed codebase to train and evaluate draft models for speculative decoding, plus DSpark speculative-decoding modules attached to its V4-Pro and V4-Flash checkpoints on Hugging Face.
- Anthropic Economic Index: Cadences — Claude usage hour by hour
Anthropic's June 2026 Economic Index report, called Cadences, samples Claude usage by the hour. Personal chats rise from ~35% on weekdays to ~50% on weekends; sleep questions peak around 5 a.m. and recipes around 6 p.m.
- Claude Mythos 5 restored — US Commerce lifts block for 100+ trusted partners
Claude Mythos 5 is back for 100+ pre-approved US institutions after the US Commerce Department lifted its two-week export block. Anthropic can now ship Mythos 5 to the Annex A trusted-partner list without a license.
- GPT-5.6 — OpenAI previews Sol, Terra, and Luna tiers
OpenAI announced GPT-5.6 with three named tiers — Sol (flagship), Terra (balanced), and Luna (cheap and fast) — adding new max and ultra reasoning modes. Access starts as a limited preview for trusted partners.
- 1littlecoder: 'GPT 5.6 — What, Availability, Pricing'
1littlecoder walks through OpenAI's same-day GPT-5.6 announcement — the Sol, Terra, and Luna tiers, the new max and ultra reasoning modes, the published per-million-token prices, and who actually gets access during the limited preview.
- Ornith 1.0 — open-weight coding models that learn their own RL scaffold
Ornith 1.0 is an MIT-licensed family of agentic coding LLMs (9B, 31B, 35B MoE, 397B MoE) whose RL loop writes its own task-specific scaffold instead of using a fixed human-designed harness.
- Sam Witteveen: 'Introducing Ornith 1.0' — open-weight coding LLM walkthrough
Sam Witteveen walks through Ornith 1.0, DeepReinforce's MIT-licensed coding model family whose RL loop learns its own scaffold — uploaded hours after the 9B–397B weights landed on Hugging Face.
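The DeepSeek V4 peak-hour item above says API rates double during two Beijing windows (9–12 and 14–18). A minimal Python sketch of how a client might estimate the multiplier for a given request time follows. The names `PEAK_WINDOWS`, `is_peak`, and `effective_rate` are hypothetical. Treating each window as start-inclusive and end-exclusive is an assumption, since the item does not say how boundaries are handled. None of this reflects DeepSeek's actual billing logic.

```python
from datetime import datetime, time, timedelta, timezone

# Beijing time is UTC+8 year-round (no daylight saving), so a fixed offset suffices.
BEIJING = timezone(timedelta(hours=8))

# Peak windows as reported in the item: 9-12 and 14-18 Beijing time.
# Assumption: start is inclusive, end is exclusive.
PEAK_WINDOWS = [(time(9, 0), time(12, 0)), (time(14, 0), time(18, 0))]

# "Rates double" during peak windows, per the item.
PEAK_MULTIPLIER = 2.0


def is_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware datetime falls in a Beijing peak window."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(BEIJING).time()
    return any(start <= local < end for start, end in PEAK_WINDOWS)


def effective_rate(base_rate: float, moment: datetime) -> float:
    """Scale a hypothetical off-peak base rate by the peak multiplier when applicable."""
    return base_rate * (PEAK_MULTIPLIER if is_peak(moment) else 1.0)
```

For example, a request sent at 02:00 UTC lands at 10:00 in Beijing and would be billed at twice the off-peak rate under these assumptions, while one sent at 05:00 UTC (13:00 in Beijing) would not.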