AI/TLDR — New AI Releases Daily: Models, Tools, Repos & Papers

A high-volume feed of new AI releases (models, open-source repos, developer tools, papers, datasets, and benchmarks), refreshed every 2 hours. Each release is explained in plain English so you actually understand what shipped.

AI/TLDR — every new AI model, tool, repo & paper

The latest AI releases, refreshed every 2 hours and explained in plain English.

What AI shipped today?

In the last 24 hours AI/TLDR tracked 14 new AI releases, including David Rosenthal's 'AI's Affordability Crisis' article, Latent Space's 'Red-Teaming after Mythos', and Claude Tag, Anthropic's @Claude Slack agent. AI/TLDR is an AI release tracker that follows new AI models, open-source tools, papers, datasets and benchmarks. It is refreshed every 2 hours from verified primary sources, and each release is explained in plain English.

David Rosenthal: 'AI's Affordability Crisis' — the 70x subsidy that can't hold
David Rosenthal · 2026-06-23 · article
David Rosenthal pulls together SemiAnalysis and Ed Zitron numbers to argue that AI tokens are sold far below cost, with subsidies of up to 40x at Anthropic and up to 70x at OpenAI, and that billing at real cost would turn a $200 ChatGPT plan into a $14,000 bill.
Latent Space: 'Red-Teaming after Mythos' — Gray Swan on AI security
Latent Space · 2026-06-22 · article
Latent Space hosts Zico Kolter (OpenAI board, CMU) and Matt Fredrikson (Gray Swan CEO) to argue AI security is not 'cybersecurity with AI' — Gray Swan's Shade red-teaming model now beats human attackers at breaking frontier LLMs.
Claude Tag — Anthropic's @Claude Slack agent for shared teamwork
Anthropic · 2026-06-23 · tool
Claude Tag is a Slack agent from Anthropic that anyone in a channel summons by tagging @Claude. The shared bot breaks a request into stages, runs the work with tools an admin scoped per channel, and posts results back to the thread.
Fireship: 'Midjourney wants to delete 30% of all death…'
Fireship · 2026-06-23 · video
Fireship reacts to Midjourney Medical's pitch — a 60-second full-body ultrasonic CT scan planned for a San Francisco spa — and the company's bold claim that AI-driven early diagnostics could prevent a large share of premature deaths.
Simon Willison: 'Prompt Injection as Role Confusion'
Simon Willison · 2026-06-22 · article
Simon Willison highlights a new paper by Charles Ye, Jasmine Cui, and Dylan Hadfield-Menell arguing prompt injection is really 'role confusion' — language models lean on style cues, not content, to tell trusted text from user input.
Armin Ronacher: 'The Coming Loop' — why even skeptics end up looping
Armin Ronacher · 2026-06-23 · article
Armin Ronacher argues 'harness loops' — outer systems that re-run AI agents past their natural stopping point — work well for code porting and benchmark runs, but breed defensive, dependency-creating code when pointed at real codebases.
Mistral OCR 4 — 170-language document model with bounding boxes and confidence scores
Mistral AI · 2026-06-23 · model
Mistral OCR 4 extracts text plus per-block bounding boxes, type labels, and confidence scores across 170 languages, scoring 85.20 on OlmOCRBench and 93.07 on OmniDocBench at $4 per 1,000 pages.
Wes Roth: 'Cursor JUST beat EVERYONE…'
Wes Roth · 2026-06-23 · video
Wes Roth's new video argues Cursor has pulled ahead of rival AI coding agents, walking through the Cursor Compile 26 opening keynote and the Composer 2.5 in-house coding model.
Baidu Unlimited-OCR — 3B vision model parses long documents in one pass
Baidu · 2026-06-22 · model
Baidu's Unlimited-OCR is a 3B vision-language model that introduces Reference Sliding Window Attention to keep a constant KV cache, letting one forward pass transcribe dozens of document pages within a 32K context. Code and weights ship under MIT.
PP-OCRv6 — PaddlePaddle ships 50-language OCR family from 1.5M to 34.5M params
PaddlePaddle · 2026-06-22 · model
PP-OCRv6 is the next PaddleOCR family with Tiny (1.5M), Small (7.7M), and Medium (34.5M) tiers covering 50 languages. The Medium tier lifts detection Hmean to 86.2% and recognition accuracy to 83.2%, gains of 4.6 and 5.1 points over PP-OCRv5_server.
Oak — version control built for AI coding agents
Oak · 2026-06-22 · tool
Oak is a new version control system for AI coding agents that mounts repos lazily, runs branch-per-task, and benchmarks up to 95% faster than Git on snapshots, large binaries, and dirty trees.
Anthropic-Cybersecurity-Skills v1.3.0 — 817 security skills across 6 frameworks
Mahipal Jangra · 2026-06-22 · repo
Mahipal Jangra's open Anthropic-Cybersecurity-Skills library jumps from 762 to 817 agent skills in v1.3.0, adding AI Security, Supply Chain, and Hardware/Firmware domains plus MITRE F3 as a sixth framework mapping.
Simon Willison — porting Moebius image inpainting to the browser via Claude Code
Simon Willison · 2026-06-22 · article
Simon Willison shows how he used Claude Opus 4.8 to port the 0.22B Moebius image-inpainting model from PyTorch/CUDA to a browser-only WebGPU + ONNX demo, with the agent doing the framework conversion, weight upload, and UI work.
Claude Code 2.1.186 — MCP login CLI plus auto-reply to bash commands
Anthropic · 2026-06-22 · tool
Claude Code v2.1.186 adds claude mcp login/logout for CLI-based MCP server auth, makes ! bash commands auto-prompt Claude to respond, and fixes 20+ background-agent and post-sleep streaming bugs.
OpenAI Codex — SSD-burning SQLite log bug patched after 640 TB/year reports
OpenAI · 2026-06-22 · tool
OpenAI Codex CLI shipped two patches that cut about 85% of its SQLite log writes. Users had measured 37 TB written in 21 days, on track for 640 TB a year and full-drive SSD wear in months.
Two Minute Papers: 'DeepSeek Just Solved AI's Billion Dollar Problem'
Two Minute Papers · 2026-06-22 · video
Two Minute Papers walks through the DualPath paper, which attacks the KV-cache I/O bottleneck behind agentic LLM serving costs and reports up to 1.96x higher online throughput.
Sakana Fugu — multi-agent orchestration model that matches Fable 5 on quality
Sakana AI · 2026-06-22 · model
Sakana AI launched Fugu and Fugu Ultra, multi-agent orchestration models delivered through one OpenAI-compatible API. Fugu Ultra coordinates a pool of expert agents and is reported to match Fable 5 on coding, reasoning, science, and agentic benchmarks.
Hermes Agent v0.17.0 — iMessage, WhatsApp, and async subagents from Nous
Nous Research · 2026-06-19 · repo
Hermes Agent v0.17.0 'The Reach Release' adds iMessage support via Photon Spectrum (no Mac relay), an official WhatsApp Business Cloud adapter, Raft agent-network integration, and background subagents that return handles. 1,475 commits, 245 contributors.
Palmier Pro v0.3.5 — open-source macOS video editor for AI agents
Palmier · 2026-06-20 · tool
Palmier Pro v0.3.5 adds transcript-based cutting, ripple-insert trims with linked audio, folder imports, and a Claude Opus 4.8 upgrade to the Swift-native macOS video editor that exposes every timeline action to AI agents over MCP.
Cloudflare Temporary Accounts — AI agents deploy live Workers in seconds, no signup
Cloudflare · 2026-06-19 · tool
Cloudflare's new wrangler deploy --temporary command lets an AI agent provision a live Workers account in seconds without signup. The account stays usable for 60 minutes, after which a human can claim it permanently or let it auto-expire.
Grok on Databricks — xAI models land in Agent Bricks via SpaceX deal
xAI · 2026-06-18 · tool
Grok on Databricks makes Grok 4.3 and Grok Build 0.1 natively callable from Databricks Agent Bricks, so enterprise teams can wire xAI models into governed Lakehouse data without external pipelines.
John Jumper to Anthropic — Nobel laureate AlphaFold creator leaves DeepMind
Anthropic · 2026-06-19 · ecosystem
John Jumper, the 2024 Nobel chemistry laureate and AlphaFold co-creator, posted on X that he is leaving Google DeepMind after nearly nine years to join Anthropic. He is the second senior Google AI departure this week after Noam Shazeer.
Nathan Lambert: 'Banning Open Source AI Would Be A Mistake'
Interconnects AI · 2026-06-19 · article
Nathan Lambert and Kevin Xu argue banning open-source AI would hurt US security, education, and competition. Their Interconnects post responds to an executive order to review AI models and a block on foreign access to Anthropic models.
Moebius — 0.22B image inpainting matches FLUX.1-Fill-Dev's 11.9B
HUST + VIVO AI Lab · 2026-06-18 · paper
Moebius is a 0.22B image inpainting model from HUST and VIVO AI Lab that matches the 11.9B FLUX.1-Fill-Dev across six benchmarks while running over 15x faster, with code and weights now on GitHub.
agent-eval — Hugging Face harness benchmarks coding agents on your own library
Hugging Face · 2026-06-18 · benchmark
Hugging Face shipped agent-eval, an open harness that measures how well coding agents like Kimi-K2.6 and GLM-5.1 use a library — not just task completion, but token cost, time, and error rate across bare, clone, and skill access tiers.
Beyond LoRA — Hugging Face benchmark shows OFT and BEFT can beat the default
Hugging Face · 2026-06-18 · tutorial
Hugging Face benchmarked five PEFT methods on identical tasks and found OFT beats LoRA on image fine-tuning while BEFT and Lily trade memory for accuracy on math. PEFT also gained an adapter-to-LoRA converter for vLLM compatibility.
Cursor 3.8 — /automate skill plus new GitHub and Slack automation triggers
Cursor · 2026-06-18 · tool
Cursor 3.8 launches Cursor Automations: describe a task in plain English with /automate and it runs on its own when a GitHub event or Slack emoji react fires. Adds five GitHub triggers; cloud agents get computer use by default.
Noam Shazeer to OpenAI — Gemini co-lead becomes Lead for Architecture Research
OpenAI · 2026-06-17 · ecosystem
Noam Shazeer, Google VP and Gemini co-lead, is leaving Google to join OpenAI as Lead for Architecture Research, two years after Google paid $2.7B to bring him back from Character.AI.
In the Weights — ex-OpenAI tool scores whether AI models remember your name
In the Weights · 2026-06-18 · showcase
Joey Flynn and Thomas Dimson, both ex-OpenAI, launched a free site that types a name into multiple LLMs in parallel and returns a 0–996 'strength score' for how well models recall the person from training data alone.
DeepMind AI Control Roadmap — defense-in-depth for misaligned AI agents
Google DeepMind · 2026-06-18 · article
Google DeepMind publishes the AI Control Roadmap: a defense-in-depth framework that treats internal AI agents as insider threats, with trusted supervisor models monitoring their actions in real time.
Two Minute Papers: 'Scientists Found A Better Language For AI Agents'
Two Minute Papers · 2026-06-19 · video
Two Minute Papers covers RecursiveMAS, a UIUC/Stanford/NVIDIA/MIT framework that lets multi-agent systems trade latent thoughts instead of text and reports 2.4x faster inference with 75.6% fewer tokens.
MolmoMotion — Ai2's language-guided 3D motion forecasting models
Allen Institute for AI · 2026-06-17 · model
Ai2 released MolmoMotion, two open models that predict where points on objects will move in 3D space from a video frame plus a text instruction. The drop bundles a 1.16M-video training set and the PointMotionBench eval, and lifts a robot pick-and-place baseline from 56.0% to 76.3%.
MosaicLeaks — ServiceNow benchmark for research-agent privacy leaks
ServiceNow Research · 2026-06-18 · benchmark
ServiceNow released MosaicLeaks, a 1,001-chain benchmark that measures how much private context a research agent leaks into its web queries, plus PA-DR, an RL recipe that drops leakage from 51.7% to 9.9% on Qwen3-4B with no loss of task success.
Agentic Resource Discovery — HF, Microsoft, Google open spec for tool lookup
Hugging Face · 2026-06-17 · ecosystem
Hugging Face, Microsoft, Google, and GoDaddy launched Agentic Resource Discovery, an open spec that lets agents find MCP servers, skills, and A2A endpoints at runtime. ARD defines an ai-catalog.json manifest and a POST /search registry API, with a reference Discover Tool on HF.
Sam Witteveen: VibeThinker 3B — taking on giant models
Sam Witteveen · 2026-06-19 · video
Sam Witteveen reviews Weibo's VibeThinker-3B, the new 3B-parameter reasoning model that scores 80.2% on LiveCodeBench v6. Witteveen walks through how a 3B open-weights model competes with much larger frontier models on code and math reasoning.
MCP Enterprise-Managed Authorization — zero-touch OAuth for Claude, VS Code, Linear
Model Context Protocol · 2026-06-18 · ecosystem
Enterprise-Managed Authorization, an MCP extension, is now stable. Admins provision MCP servers once through Okta and users get every connector on first login with no per-app OAuth. Claude, VS Code, Linear, Figma, Asana, Atlassian and Supabase ship support; Ramp is live with 2,000 users.
OpenAI Codex Record & Replay — demonstrate a macOS workflow, get a reusable skill
OpenAI · 2026-06-18 · tool
OpenAI Codex App 26.616 adds Record & Replay on macOS. Run a workflow once and Codex saves it as an editable, reusable skill that replays through Computer Use, browser tools, and plugins. Requires Computer Use; not in the EEA, UK, or Switzerland at launch.
Simon Willison: GLM-5.2 is probably the most powerful text-only open weights LLM
Simon Willison · 2026-06-17 · article
Simon Willison calls Z.ai's GLM-5.2 today's strongest open-weights text LLM: top of Artificial Analysis Intelligence Index v4.1 at 51, second on Code Arena WebDev behind Claude Fable 5, and ~$1.40/$4.40 per 1M tokens on OpenRouter vs GPT-5.5's $5/$30.
Wes Roth: Google's 'POST AGI' paper — DeepMind's AGI-to-ASI roadmap
Wes Roth · 2026-06-18 · video
Wes Roth breaks down 'From AGI to ASI', a Google DeepMind paper co-signed by Shane Legg, Marcus Hutter and Thore Graepel that maps four routes from human-level AI to superintelligence.
LifeSciBench — OpenAI's 750-task benchmark for life-science research
OpenAI · 2026-06-17 · benchmark
OpenAI's LifeSciBench grades AI models on 750 expert-authored life-science tasks using rubrics. The strongest model, GPT-Rosalind, passes only 36.1%, with attached data files cited as the main bottleneck.

Frequently asked questions

What is AI/TLDR?

AI/TLDR is a high-volume tracker of new AI releases — models, open-source repos, developer tools, papers, datasets, benchmarks and security findings — refreshed every 2 hours and explained in plain English.

How often is the feed updated?

An automated agent sweeps every 2 hours and publishes a fresh build to the site. Items are sorted by ingest time so the newest releases always float to the top.
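
For readers pulling the raw JSON, the ordering described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual code: the `ingested_at` field name, the timestamps, and the `newest_first` helper are all assumptions. Note that ingest time can differ from the publish date shown next to each item, which is why dates in the feed are not strictly descending.

```python
from datetime import datetime

# Hypothetical feed items: the "ingested_at" field name and the timestamps
# are assumptions for illustration, not the site's actual schema.
feed = [
    {"title": "Hermes Agent v0.17.0", "ingested_at": "2026-06-19T10:00:00+00:00"},
    {"title": "Claude Tag", "ingested_at": "2026-06-23T14:00:00+00:00"},
    {"title": "Mistral OCR 4", "ingested_at": "2026-06-23T12:00:00+00:00"},
]


def newest_first(items):
    """Return a new list ordered by ingest time, most recent first."""
    return sorted(
        items,
        key=lambda item: datetime.fromisoformat(item["ingested_at"]),
        reverse=True,
    )


for item in newest_first(feed):
    print(item["ingested_at"], item["title"])
```

Parsing the timestamps before comparing them keeps the ordering correct even if items carry different UTC offsets, which a plain string sort would not guarantee.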

Is AI/TLDR free?

Yes — the site is free to read with no signup. There is an optional newsletter and a Buy-Me-a-Coffee tip jar if you want to support it.

Where does the data come from?

Every item is fetched and verified from a primary source — vendor blog post, GitHub release, arXiv paper, official announcement. Nothing is hallucinated; if a URL or claim cannot be verified, the item is dropped.

How do you decide what's worth covering?

We catch the hype: frontier-lab releases, hyped open-source drops, multi-outlet stories, pricing or capability shifts. Items are tagged seismic, major or notable based on impact.

Can I subscribe to a newsletter?

Yes — there is a daily digest delivered via Buttondown. Subscribe from the homepage banner.

Learn AI from zero

New to LLMs, RAG or agents? Our free Learn AI encyclopedia explains every concept, tool and framework in plain English — 652 articles and counting.

LLM Fundamentals · Prompt Engineering · Working with LLM APIs · Embeddings & Vector Databases · Retrieval-Augmented Generation (RAG) · AI Agents · Agent SDKs & Frameworks · AI Coding & Developer Tools · Fine-Tuning & Model Customization · Local & Open Models · Multimodal AI · Production & LLMOps · Evaluation & Safety · Building AI Apps

Compare AI models

Our LLM registry tracks 245 large language models — frontier and open-weight — with verified specs, benchmarks, pricing and APIs, one detail page each.

Anthropic · OpenAI · Google · Meta · DeepSeek · Alibaba (Qwen) · Moonshot AI (Kimi) · Z.ai (Zhipu / GLM) · xAI (Grok) · Mistral AI · Cohere · MiniMax