Nebius · 2026-05-01 · major
Nebius Acquires Eigen AI for $643M — Bringing MIT HAN Lab Inference Stack to Token Factory
Amsterdam-based Nebius is buying inference-optimization startup Eigen AI for $643M in cash and shares to fold custom CUDA/Triton kernels, weight compression, and KV-cache optimizations into its managed Token Factory platform.
Nebius pays $643M for the MIT HAN Lab team behind AWQ and SpAtten — bolting their inference stack onto Token Factory.
Key specs
| Spec | Value |
|---|---|
| Deal value (USD) | $643M |
| Cash component (USD) | $98M |
| Stock component | 3.8M Nebius Class A shares |
| Expected close | Within weeks |
What is it?
Nebius Group, the Amsterdam-based AI cloud spun out of Yandex, is acquiring Eigen AI, a year-old US inference-optimization startup founded by MIT HAN Lab alumni Ryan Hanrui Wang and Wei-Chen Wang together with MIT CSAIL PhD Di Jin. The all-in price is roughly $643M, paid as about $98M in cash plus 3.8M Nebius Class A shares.
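For readers who want to sanity-check the split, the headline figures imply a value for the stock portion. The sketch below is back-of-envelope arithmetic on the rounded numbers quoted above; the implied per-share price is a derived estimate, not a stated deal term.

```python
# Back-of-envelope split of the deal, using the rounded headline figures.
deal_value_usd = 643_000_000   # total consideration (approximate)
cash_usd = 98_000_000          # cash component (approximate)
shares_issued = 3_800_000      # Nebius Class A shares

# Whatever is not cash is attributed to the stock component.
stock_value_usd = deal_value_usd - cash_usd

# Implied value per share and the stock share of the total price.
implied_price_per_share = stock_value_usd / shares_issued
stock_fraction = stock_value_usd / deal_value_usd

print(f"Stock component: ${stock_value_usd / 1e6:.0f}M")
print(f"Implied price per share: ${implied_price_per_share:.2f}")
print(f"Stock share of deal: {stock_fraction:.1%}")
```

On these inputs the stock portion comes to about $545M, or roughly 85% of the total, so the final value of the deal depends mostly on where Nebius shares trade at close.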
How does it work?
Eigen AI's stack replaces the default kernels used to run open models like Llama, Qwen, and DeepSeek with hand-tuned CUDA and Triton implementations, compresses weights with the team's own AWQ-lineage quantization, and reworks the KV cache, whose memory footprint limits long-context throughput. It also offers LoRA-style post-training so customers can fine-tune without retraining the base model. Nebius will fold all of that into Token Factory, its managed inference service that already exposes autoscaling endpoints for ~12 open-source models.
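To see why the KV cache matters for long-context serving, it helps to estimate its size. The sketch below uses the standard sizing formula for a transformer with grouped key/value heads; the model shape (32 layers, 8 KV heads, head dimension 128) is a hypothetical example chosen for illustration, and nothing here describes Eigen AI's actual implementation.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch_size=1):
    """Estimate the memory needed to cache keys and values.

    Each layer stores two tensors (K and V), each shaped
    [batch_size, num_kv_heads, seq_len, head_dim].
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * bytes_per_value * batch_size)


GIB = 1024 ** 3

# Hypothetical model shape, for illustration only.
model = dict(num_layers=32, num_kv_heads=8, head_dim=128)

# One 128k-token sequence with a 16-bit cache versus an 8-bit cache.
fp16_cache = kv_cache_bytes(seq_len=131_072, bytes_per_value=2, **model)
int8_cache = kv_cache_bytes(seq_len=131_072, bytes_per_value=1, **model)

print(f"16-bit cache: {fp16_cache / GIB:.1f} GiB per sequence")
print(f"8-bit cache:  {int8_cache / GIB:.1f} GiB per sequence")
```

Under these assumptions a single 128k-token request holds 16 GiB of cache at 16-bit precision, so halving the bytes per value roughly doubles how many long requests fit on one accelerator. That is the kind of lever KV-cache optimization pulls.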
Why does it matter?
Inference cost is now the bottleneck, and competition over frontier-grade serving has become fiercer than the model race itself. Nebius is buying a team whose research (SpAtten, AWQ, CGPO) underpins much of today's production inference, and is using them to anchor a new Bay Area engineering hub. Expect lower per-token prices and lower latency on Token Factory's open-model menu, and more pressure on Together, Fireworks, and Groq.
Who is it for?
AI infra teams shopping for managed inference; finance watchers tracking the inference layer.
Try it
https://nebius.com/services/token-factory