New AI GitHub Repos — Trending Open-Source Drops
Trending open-source AI projects fresh off GitHub — new repos, libraries and frameworks, with what each one does and why it's picking up stars.
54 releases tracked
- cuTile Rust v0.2.0 — NVIDIA Labs ships NVFP4 GPU kernels in safe Rust
cuTile Rust is NVIDIA Labs' safe tile-based GPU kernel DSL for Rust, now with NVFP4 packing and a new performance paper.
- agent-skills 0.6.2 — Addy Osmani adds /build and /webperf for AI coding agents
Addy Osmani's open-source agent-skills bundle hits v0.6.2 with new slash commands and hardened security skills.
- OpenCV 5.0 Ships With Built-In LLM and VLM Inference — Rewritten Graph-Based DNN Engine Pushes ONNX Coverage From ~22% to Over 80%, Native Tokenizer and KV-Cache Run Qwen 2.5, Gemma 3, PaliGemma, and GPT-Family Models Out of the Box
The most-installed computer-vision library lands version 5.0 with a graph-based DNN engine and built-in LLM / VLM inference, timed for CVPR 2026 in Denver.
- Nous Research Ships Hermes Agent v0.16.0 'The Surface Release' — Native Electron Desktop App for macOS, Linux, and Windows With In-App Self-Updates, Drag-and-Drop, and a Full Web Admin Panel
Nous Research's self-improving agent gets a native desktop client and a no-config web admin panel; the project crossed 183K stars on the same day.
- Headroom v0.26.0 — context compressor adds Copilot BYOK and Bedrock proxy
A drop-in token compression layer for AI agents that claims 60–95% fewer tokens with a single-digit accuracy hit.
- PewDiePie Open-Sources Odysseus — Self-Hosted AI Workspace Hits 20K Stars in 24 Hours With Chat, Agents, MCP, Deep Research, and Email Triage
Felix Kjellberg drops a 20K-star self-hosted AI workspace that runs entirely on your hardware — chat, agents, deep research, and email.
- LlamaIndex Rewrites LiteParse in Rust for v2.0 — 5–100× Faster Open-Source PDF and Office Parser, 457-Page Doc in 0.777 Seconds, Native + Python + Node + WASM
LlamaIndex rewrote its open-source PDF and Office parser in Rust: same CLI, 5–100× faster, running natively on Linux, macOS, and Windows, from Python and Node, and in the browser via WASM.
- Nous Research Ships Hermes Agent v0.15.0 'The Velocity Release' — Multi-Agent Kanban Platform, 76% Smaller Core Loop, 4,500× Faster Session Search
The Velocity release rewires Hermes Agent into a multi-agent Kanban swarm and shrinks the core loop by 76%.
- DeepSeek-Reasonix — Terminal Coding Agent Built Around DeepSeek's Prefix Cache Reports a 99.82% Cache Hit Rate, Cutting a 435M-Token Day From ~$61 to ~$12
A terminal coding agent that bets on DeepSeek's prefix cache to keep long sessions cheap.
- Remove-AI-Watermarks Hits the HN Front Page — MIT-Licensed CLI Strips Visible Gemini Watermarks Plus Invisible SynthID, C2PA, and EXIF Provenance
An MIT-licensed library lifts every major AI provenance marker — and lands on the HN front page the same week OpenAI adopts SynthID.
- CodeGraph 0.7.10 — Local SQLite + Tree-Sitter Knowledge Graph Cuts Claude Code, Codex, and Cursor Exploration Tool Calls by 94%
Pre-index a repo into a local knowledge graph and let your coding agent query it instead of grepping every file.
- Forge — Open-Source Guardrails Lift a Self-Hosted 8B Local Model From 53% to 99% on Multi-Step Agentic Tool-Calling Workflows
Drop-in reliability layer that pushes 8B local models toward frontier-API accuracy on tool-calling agents.
- Semble — Code Search for AI Agents Uses ~98% Fewer Tokens Than grep+read, Hits the Hacker News Front Page
A CPU-only code-search MCP server that cuts the tokens an agent burns hunting through a repo.
- InsForge v2.1 — Open-Source Backend Platform for Coding Agents Adds Compute Services, Realtime Presence, Stripe
All-in-one open-source backend (DB, auth, storage, compute, AI gateway) that coding agents drive end-to-end via MCP or CLI.
- ds4 — Antirez Ships C/Metal Inference Engine for DeepSeek V4 Flash on Apple Silicon
Antirez's first AI repo: a DeepSeek V4 Flash-only inference engine in C, built for Apple Silicon with a compressed disk-backed KV cache.
- vLLM v0.20.0 — DeepSeek V4, FlashAttention 4 Default, TurboQuant 2-bit KV Cache
vLLM's biggest release of 2026: DeepSeek V4, FA4 as default, and 4× KV cache via 2-bit compression.
- SGLang v0.5.10 — Native Apple Silicon Backend, Elastic MoE Fault Tolerance, sglang-kernel 0.4.1
SGLang adds Apple Silicon inference, resilient MoE failover, and a 1000× RDMA reduction for large-scale clusters.
- llama.cpp Build b8738 — Vendor-Agnostic Tensor Parallelism, 1-Bit Quantization, AMD CDNA4
llama.cpp gains true multi-GPU tensor parallelism across CUDA, ROCm, and Metal — no vendor lock-in.
- RAGFlow v0.25.0 — Seven Pipeline Templates, Agent Publishing, User Memory, Mobile-Ready
RAGFlow v0.25 ships production-ready agent publishing, persistent user memory, and ingestion pipelines for seven document types.
- PageIndex File System — Vectorless Reasoning RAG Scales to Millions of Documents
PageIndex File System adds a query-time virtual tree that makes vectorless reasoning RAG work across millions of documents.
- Unsloth Studio v0.1.37 — New UI Redesign, Qwen3.6 Support, Preserve Thinking Mode
Unsloth Studio's April redesign adds Qwen3.6 support, a Preserve Thinking toggle, and a developer role for agentic coding tools.
- Google ADK Python v1.32.0 — Native OpenTelemetry Metrics, Anthropic Thinking Blocks, Visual Graph Canvas
Google's ADK now ships native OpenTelemetry metrics and visual agent graphs in its 2026 multi-agent framework.
- Llama Stack v0.8.0 — Native Anthropic Messages API, Gemini Interactions, 91% OpenAI Compatibility
Llama Stack now speaks Anthropic's Messages API and Google's Interactions API from a single unified GenAI stack.
- Goose v1.2 — Automatic MCP Server Discovery, ACP Protocol, 44k GitHub Stars
Goose adds MCP auto-discovery and ACP agent-to-agent protocol — the reference implementation for the evolving MCP ecosystem.
- ml-sharp-web — Apple's SHARP 3D Gaussian Splat Model Runs Entirely In Your Browser
Apple's one-shot photo-to-3D Gaussian splat model, ported to ONNX and shipped as a 100% in-browser web app.
- DeepClaude — Run Claude Code's Agent Loop on DeepSeek V4 Pro for 17x Cheaper Tokens
Point Claude Code at DeepSeek V4 Pro and pay $0.87 per million output tokens instead of $15 — same agent loop, same tools, different brain.
- DeepSeek-TUI v0.8.7 — Rust Terminal Coding Agent for DeepSeek V4 With 1M-Token Context
A native, single-binary terminal agent built specifically around DeepSeek V4's 1M-token context and prefix cache.
- Ruflo v3.6.10 — 32 Plugins, Agent Federation, IoT Cognitum for Multi-Agent Claude Orchestration
Open-source orchestrator for swarms of Claude Code agents picks up federation and an IoT bridge in a 32-plugin v3.6 release.
- jcode — Rust Coding Agent Harness Hits #4 on GitHub Trending With Swarm Mode and Self-Modification
A solo-developer Rust coding-agent harness reaches #4 on GitHub Trending with claims of a 14 ms boot time and a built-in swarm mode.
- OpenWarp — Community Fork of Warp Opens Up Bring-Your-Own AI Provider
Plug any OpenAI-compatible model into Warp's terminal — credentials stay local, no cloud relay.
- Mike — Open-Source, Self-Hostable Alternative to Harvey and Legora for Law Firms
Self-hostable open-source legal AI: chat with documents, cite verbatim, run multi-step workflows, draft contracts.
- Dirac — Open-Source Coding Agent Tops Terminal-Bench-2 on Gemini 3 Flash Preview
A solo-built Cline fork takes #1 on Terminal-Bench-2 with a Flash-tier model — the harness, not the model, was the bottleneck.
- Beads — Git-Backed Dependency Graph Issue Tracker Built for AI Coding Agents
A dependency-aware issue tracker built for AI coding agents — versioned graph storage with hash IDs that survive multi-agent merges.
- GoModel — Lightweight AI Gateway in Go, LiteLLM Alternative
A Go-based AI gateway that exposes a single OpenAI-compatible endpoint across 10+ LLM providers, with caching, cost tracking, and an admin dashboard.
- Matt Pocock's Claude Code Skills — 20+ Real-World Agent Skills Go Viral
A TypeScript educator's personal collection of Claude Code agent skills — TDD loops, PRD generation, issue triage, and architecture review, open-sourced and trending.
- GitNexus — Code Intelligence Graph with MCP for AI Editors
A code knowledge graph that gives AI editors structural awareness of your entire codebase via 16 MCP tools.
- OpenChronicle — Open-Source Local-First Screen Memory for LLM Agents
Free, local-first screen memory for AI agents — built in 48 hours as a direct response to OpenAI paywalling Chronicle at $100/month.
- Browser Harness: Self-Healing LLM Browser Control Directly on CDP
A minimal Python shim (~592 lines) that connects an LLM directly to Chrome via CDP — and lets the agent edit its own harness when it gets stuck.
- free-claude-code — Route Claude Code CLI to Free LLM Providers
A Python proxy that lets you use the Claude Code CLI and VS Code extension with free LLM backends instead of Anthropic's API.
- vercel-labs/skills — Universal CLI for Discovering and Installing Agent Skills Across 45+ Coding Tools
One CLI to install, share, and discover reusable instruction sets for any AI coding agent.
- RAG-Anything — All-in-One Multi-Modal RAG Framework for Text, Images, Tables, and Equations
RAG for real-world documents — handles images, tables, equations, and charts alongside text in a single pipeline.
- Claude Context — Semantic Codebase Search MCP with ~40% Token Savings for Coding Agents
One MCP install gives your coding agent semantic search over the entire codebase — no per-conversation file loading needed.
- CrabTrap — LLM-as-Judge HTTP Proxy to Secure AI Agents in Production
CrabTrap sits between your AI agent and the internet, vetting every outbound request against natural-language security policies before it leaves.
- LuceBox Hub — Hand-Tuned LLM Inference Reaching 207 tok/s on an RTX 3090
LuceBox rewrites LLM inference from scratch for one GPU at a time, reaching 207 tok/s with Qwen3.5-27B on a consumer RTX 3090.
- RuView — WiFi DensePose: 17-Point Body Pose Through Walls, No Camera Required
Real-time body pose and vital sign monitoring through walls using commodity WiFi — no camera, no wearable, $9 hardware.
- OpenCode — open-source terminal AI coding agent crosses 146k GitHub stars
The largest open-source AI coding agent on GitHub — terminal-first, provider-agnostic, and a viable swap for Claude Code if you want fully open infra.
- Mozilla Thunderbolt — Open-Source Self-Hostable Enterprise AI Client
Mozilla's open-source enterprise AI client that runs on your own infrastructure and connects to any model provider.
- Hermes Agent v0.10.0 — Tool Gateway for Self-Growing Multi-Platform AI Agents
Self-growing agent framework that learns from interactions and runs on every platform from CLI to WhatsApp.
- Lyra 2.0 — NVIDIA's Explorable Generative 3D World Framework
NVIDIA's open-source framework for generating interactive 3D worlds — walk through, explore, export to Isaac Sim.
- ClawGUI — Unified Framework for Training, Evaluating, and Deploying GUI Agents
Train, benchmark, and deploy mobile GUI agents — all from one auditable open-source repo.
- Open Agents — Open-Source Cloud Coding Agent Platform by Vercel Labs
Vercel Labs' forkable cloud coding agent: chat drives the task, an isolated sandbox runs it, and the Workflow SDK keeps it durable.
- Shep — parallel AI coding agent orchestrator across git worktrees
Run ten feature branches in parallel — each with its own AI agent, git worktree, and PR — from a single CLI command.
- AutoResearch — autonomous ML experiment runner
Give an AI agent a small LLM training setup, go to sleep, and wake up to 100 completed experiments — Karpathy's viral 630-line repo.
- OpenWorldLib — unified world-model codebase
One codebase, one definition, six modules — an attempt to pull scattered world-model research into a single inference framework.