Nebius · 2026-05-01 · major

Nebius Acquires Eigen AI for $643M — Bringing MIT HAN Lab Inference Stack to Token Factory

Rating: 4
Author: AI/TLDR

Amsterdam-based Nebius is buying inference-optimization startup Eigen AI for $643M in cash and shares to fold custom CUDA/Triton kernels, weight compression, and KV-cache optimizations into its managed Token Factory platform.

[Image: Nebius newsroom share card announcing the acquisition of Eigen AI to strengthen Token Factory.]

Nebius pays $643M for the MIT HAN Lab team behind AWQ and SpAtten — bolting their inference stack onto Token Factory.

Key specs

Deal value (USD)	$643M
Cash component (USD)	$98M
Stock component	3.8M Nebius Class A shares
Expected close	Within weeks

What is it?

Nebius Group, the Amsterdam-based, Nasdaq-listed AI cloud spun out of Yandex, is acquiring Eigen AI, a year-old US inference-optimization startup founded by MIT HAN Lab alumni Ryan Hanrui Wang and Wei-Chen Wang together with MIT CSAIL PhD Di Jin. The all-in price is roughly $643M, paid as ~$98M in cash plus 3.8M Nebius Class A shares.
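As a rough check on the deal structure, the sketch below backs out the per-share value implied by the headline figures. It assumes the $643M total is simply the cash component plus the shares valued at a single reference price; the source gives only rounded numbers, so the result is approximate and is not a quoted share price.

```python
# Back-of-the-envelope split of the headline deal value.
# Inputs are the rounded figures from the announcement summary (in millions).
deal_value_musd = 643.0   # total consideration, USD millions
cash_musd = 98.0          # cash component, USD millions
shares_m = 3.8            # Nebius Class A shares issued, millions

# Assumption: total = cash + shares * reference price.
stock_value_musd = deal_value_musd - cash_musd
implied_price_usd = stock_value_musd / shares_m
cash_fraction = cash_musd / deal_value_musd

print(f"Stock component: ${stock_value_musd:.0f}M")
print(f"Implied value per share: ${implied_price_usd:.2f}")
print(f"Cash share of deal: {cash_fraction:.1%}")
```

On these inputs the stock portion is about $545M, so roughly 85% of the consideration is paid in shares.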

How does it work?

Eigen AI's stack replaces the default kernels used to serve open models such as Llama, Qwen, and DeepSeek with hand-tuned CUDA and Triton implementations, compresses weights with the team's own AWQ-lineage quantization, and reworks the KV cache, whose memory footprint limits long-context throughput. It also offers LoRA-style post-training, so customers can fine-tune by training small adapter weights rather than retraining the base model. Nebius will fold all of that into Token Factory, its managed inference service, which already exposes autoscaling endpoints for ~12 open-source models.
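To see why weight compression and KV-cache work matter for serving cost, here is a minimal memory estimate. The model shape is hypothetical (an 8B-parameter model with 32 layers, 8 KV heads, and head dimension 128) and is not taken from Eigen AI or Token Factory. The 4-bit figure ignores the per-group scale and zero-point metadata that real quantization schemes such as AWQ also store.

```python
GIB = 1024 ** 3

def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Memory needed to hold the model weights at a given precision."""
    return n_params * bits_per_weight / 8

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int, bytes_per_elem: int = 2) -> int:
    """Memory for cached keys and values across all layers.

    The leading factor of 2 counts one key tensor plus one value tensor
    per layer; bytes_per_elem=2 corresponds to 16-bit cache entries.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem

# Hypothetical model configuration, for illustration only.
n_params = 8_000_000_000
fp16_weights = weight_bytes(n_params, 16)
int4_weights = weight_bytes(n_params, 4)
kv_one_request = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128,
                                seq_len=32_768, batch_size=1)

print(f"16-bit weights: {fp16_weights / GIB:.2f} GiB")
print(f"4-bit weights:  {int4_weights / GIB:.2f} GiB")
print(f"KV cache, one 32k-token request: {kv_one_request / GIB:.2f} GiB")
```

Under these assumptions, a single 32k-token request needs about as much cache memory as the entire 4-bit model, which is why long-context throughput depends so heavily on how the KV cache is stored and managed.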

Why does it matter?

Inference economics are now the bottleneck: competition in frontier-grade serving has become fiercer than the model race itself. Nebius is buying a team whose research (SpAtten, AWQ, CGPO) underpins much of today's production inference, and is using that team to anchor a new Bay Area engineering hub. Expect lower per-token prices and lower latency on Token Factory's open-model menu, and more pressure on Together, Fireworks, and Groq.

Who is it for?

AI infra teams shopping for managed inference; finance watchers tracking the inference layer.

Try it

https://nebius.com/services/token-factory