AI/TLDR

Nebius · 2026-05-01 · major

Nebius Acquires Eigen AI for $643M — Bringing MIT HAN Lab Inference Stack to Token Factory

Amsterdam-based Nebius is buying inference-optimization startup Eigen AI for $643M in cash and shares to fold custom CUDA/Triton kernels, weight compression, and KV-cache optimizations into its managed Token Factory platform.


Nebius pays $643M for the MIT HAN Lab team behind AWQ and SpAtten — bolting their inference stack onto Token Factory.

Key specs

Deal value: $643M (USD)
Cash component: $98M (USD)
Stock component: 3.8M Nebius Class A shares
Expected close: weeks

What is it?

Nebius Group, the Amsterdam-based AI cloud spun out of Yandex, is acquiring Eigen AI, a year-old US inference-optimization startup founded by MIT HAN Lab alumni Ryan Hanrui Wang and Wei-Chen Wang together with MIT CSAIL PhD Di Jin. The all-in price is roughly $643M, paid as ~$98M in cash plus 3.8M Nebius Class A shares.
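The split between cash and stock implies a per-share value for the equity component. A quick sketch of that arithmetic, assuming the rounded headline figures and assuming the $643M total values all 3.8M shares at a single price (the variable names are illustrative, and the result is only as precise as the rounded inputs):

```python
# Implied per-share value from the rounded headline figures.
# Assumption: total = cash + shares * one reference share price.
DEAL_VALUE_USD = 643_000_000
CASH_USD = 98_000_000
SHARES = 3_800_000

stock_value_usd = DEAL_VALUE_USD - CASH_USD
implied_price = stock_value_usd / SHARES

print(f"stock component: ${stock_value_usd / 1e6:.0f}M")
print(f"implied value per share: ${implied_price:.2f}")
```

On these inputs the stock component comes to about $545M, or roughly $143 per Class A share.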

How does it work?

Eigen AI's stack replaces the default kernels used to run open models such as Llama, Qwen, and DeepSeek with hand-tuned CUDA and Triton implementations, compresses weights with the team's own AWQ-lineage quantization, and reworks the KV cache, whose size and memory traffic limit long-context throughput. It also offers LoRA-style post-training, so customers can fine-tune small adapter weights without retraining the base model. Nebius will fold all of that into Token Factory, its managed inference service, which already exposes autoscaling endpoints for ~12 open-source models.
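To see why weight compression, the KV cache, and LoRA adapters matter for serving cost, a back-of-the-envelope sizing sketch can help. Everything below is an illustrative assumption rather than an Eigen AI or Nebius figure: the 8B-parameter model, its 32 layers, 8 KV heads, and head dimension of 128 are hypothetical, and the function names are made up for this example. It shows memory arithmetic only, not the AWQ algorithm or any actual kernel.

```python
# Rough memory arithmetic for the three levers described above.
# All model dimensions are hypothetical, chosen only to make the math concrete.

GIB = 1024 ** 3


def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Memory needed to store the weights at a given precision."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """KV-cache size for one sequence.

    Each token stores one key and one value vector (the factor of 2)
    per layer and per KV head.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA trains B (d_out x rank) and A (rank x d_in) instead of a full d_out x d_in update."""
    return rank * (d_in + d_out)


N_PARAMS = 8_000_000_000  # hypothetical 8B-parameter model

fp16_gib = weight_bytes(N_PARAMS, 16) / GIB
int4_gib = weight_bytes(N_PARAMS, 4) / GIB  # ignores per-group scale/zero-point overhead
kv_gib = kv_cache_bytes(32, 8, 128, seq_len=8192) / GIB

full = 4096 * 4096
lora = lora_trainable_params(4096, 4096, rank=16)

print(f"weights: {fp16_gib:.1f} GiB at 16-bit vs {int4_gib:.1f} GiB at 4-bit")
print(f"KV cache: {kv_gib:.2f} GiB per 8192-token sequence at 16-bit")
print(f"LoRA r=16 on a 4096x4096 layer trains {lora:,} of {full:,} weights ({lora / full:.2%})")
```

Under these assumptions, 4-bit weights take a quarter of the 16-bit footprint, a single 8192-token sequence consumes about 1 GiB of KV cache, and a rank-16 adapter trains under 1% of a layer's weights. Each of those savings translates into more concurrent requests per GPU, which is where per-token cost comes from.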

Why does it matter?

Inference economics are now the bottleneck: competition over frontier-grade serving has become fiercer than the model race itself. Nebius is buying a team whose research (SpAtten, AWQ, CGPO) underpins much of today's production inference, and is using that team to anchor a new Bay Area engineering hub. Expect lower per-token prices and lower latency on Token Factory's open-model menu, and more pressure on Together, Fireworks, and Groq.

Who is it for?

AI infra teams shopping for managed inference; finance watchers tracking the inference layer.

Try it

https://nebius.com/services/token-factory


Tags

  • nebius
  • eigen-ai
  • acquisition
  • inference
  • token-factory
  • cuda
  • triton
  • kv-cache
  • lora
  • post-training
  • mit-han-lab
  • open-source-models
