Zilliz · 2025-06-06 · notable

Claude Context — Semantic Codebase Search MCP with ~40% Token Savings for Coding Agents

Rating: 3
Author: AI/TLDR

Zilliz's MCP server that gives Claude Code and other coding agents (14+ supported) semantic search over the full codebase. Hybrid BM25 + vector retrieval, AST-level chunking, ~40% token savings. #1 AI trending on GitHub today, 873 new stars.

Image: zilliztech/claude-context GitHub repository, a semantic code search MCP for Claude Code and other AI coding agents

One MCP install gives your coding agent semantic search over the entire codebase — no per-conversation file loading needed.

Key specs

GitHub stars	7,129
Token reduction	~40%
Stars today	873 (#1 AI trending)
Supported agents	14+
Backend	Milvus / Zilliz Cloud

What is it?

Claude Context is an MCP server from Zilliz (the company behind the Milvus vector database) that indexes a codebase into a vector store and exposes a search tool to any MCP-compatible coding agent. Instead of loading files manually or relying on the agent to guess what exists, the agent issues a semantic query such as 'find all places where auth tokens are handled' and gets back the relevant code chunks, even in codebases with millions of lines. It is the #1 AI trending repository on GitHub today with 873 new stars, and it supports 14 coding agents, including Claude Code, Cursor, Windsurf, Cline, Roo Code, OpenAI Codex CLI, Gemini CLI, and Qwen Code.

How does it work?

The indexer runs an AST-based chunker that splits code into semantically meaningful units (functions, classes, methods) rather than arbitrary line-count windows, then embeds each chunk using one of four embedding providers (OpenAI, VoyageAI, Ollama, Gemini). Incremental re-indexing uses Merkle trees, so only changed files are re-embedded on subsequent runs. Retrieval combines BM25 keyword matching with dense vector search and fuses the two result sets, which covers both exact identifier lookups and fuzzy 'find things related to X' queries. Experiments comparing the same agent tasks with and without the MCP show roughly 40% fewer total tokens consumed when the agent uses targeted retrieval instead of loading whole directories.
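
The three ideas above (AST-level chunking, hash-based change detection, and fusing keyword and vector rankings) can be sketched in a few lines of Python. This is an illustrative sketch only, not Claude Context's implementation: it uses Python's standard ast module on Python source, a flat two-level hash tree as a simplified stand-in for a Merkle tree, and reciprocal rank fusion as an assumed fusion method (the write-up does not say which one the project uses). All function names here are hypothetical.

```python
import ast
import hashlib


def chunk_source(source: str) -> list[dict]:
    """Return one chunk per function, class, or method, ordered by start line.

    A method is emitted as its own chunk and also appears inside its class chunk.
    """
    chunks = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "start": node.lineno,
                "end": node.end_lineno,
                "text": ast.get_source_segment(source, node),
            })
    return sorted(chunks, key=lambda c: c["start"])


def file_hash(content: str) -> str:
    """Leaf hash: SHA-256 of a file's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def root_hash(leaves: dict[str, str]) -> str:
    """Root hash over all (path, leaf hash) pairs, in sorted path order."""
    h = hashlib.sha256()
    for path in sorted(leaves):
        h.update(path.encode("utf-8"))
        h.update(leaves[path].encode("utf-8"))
    return h.hexdigest()


def changed_paths(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Paths that were added, removed, or modified between two snapshots.

    If the roots match, nothing changed and the per-file comparison is skipped.
    """
    if root_hash(old) == root_hash(new):
        return set()
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk ids: score = sum of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda c: (-scores[c], c))
```

In this sketch, only the paths returned by changed_paths would need to be re-chunked and re-embedded, and reciprocal_rank_fusion would take one ranked list from a BM25 index and one from a vector search. A chunk that ranks well in both lists rises to the top, which is why a hybrid setup can handle both exact identifiers and looser conceptual queries.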

Why does it matter?

Token usage in agentic coding sessions compounds fast: a naive agent loads every file it might need into context, burning tokens on irrelevant code. At Claude Opus prices, a 40% reduction is a meaningful cost saving on any non-trivial codebase. For large repos (hundreds of thousands of lines), targeted retrieval is the difference between an agent that can navigate the whole system and one that has to be guided file by file. Because it supports many agents, a team can set it up once and get the same benefit in whichever coding tool each developer uses.
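
To put the 40% figure in rough dollar terms, here is a back-of-the-envelope sketch. The session size and the per-token price are placeholder assumptions chosen for illustration, not numbers from the project or from a current price list; substitute your own.

```python
def session_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a session that consumes `tokens` tokens at a flat per-million rate."""
    return tokens / 1_000_000 * usd_per_million_tokens


def savings_usd(baseline_tokens: int, reduction: float, usd_per_million_tokens: float) -> float:
    """Money saved when token usage drops by `reduction` (0.40 means 40%)."""
    reduced_tokens = round(baseline_tokens * (1 - reduction))
    baseline_cost = session_cost_usd(baseline_tokens, usd_per_million_tokens)
    reduced_cost = session_cost_usd(reduced_tokens, usd_per_million_tokens)
    return baseline_cost - reduced_cost


# Hypothetical inputs: a 2,000,000-token session at an assumed 15 USD per million tokens.
saved = savings_usd(baseline_tokens=2_000_000, reduction=0.40, usd_per_million_tokens=15.0)
print(f"Estimated saving per session: {saved:.2f} USD")
```

Under these assumed inputs, the saving is 800,000 tokens, or about 12 USD per session. The real figure depends on the model's input and output pricing and on how much of the session is retrieval rather than generation.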

Who is it for?

Developers and teams using Claude Code, Cursor, or any MCP-compatible coding agent on codebases with more than a few thousand lines.

Try it

claude mcp add claude-context -e OPENAI_API_KEY=<key> -e MILVUS_ADDRESS=<endpoint> -- npx @zilliz/claude-context-mcp@latest