AI/TLDR

Tencent · 2026-04-15 · major

HY-World 2.0 — Tencent Hunyuan Open-Sources Multi-Modal 3D World Model With 1,770 HF Upvotes

HY-World 2.0 accepts text, single images, multi-view images, or video and produces navigable 3D environments through panoramic generation, trajectory planning, stereo expansion, and scene assembly. The results are described as competitive with proprietary systems.

Text or a single photo → a navigable 3D world you can walk through

Key specs

GitHub stars: 1,767
Upvotes: 1,770

What is it?

A multi-modal world model that reconstructs, generates, and simulates 3D environments from diverse inputs. The new WorldLens rendering platform enables interactive exploration with character integration.

How does it work?

A four-stage pipeline: enhanced panoramic scene generation → navigation trajectory planning → stereo expansion → final scene assembly. Each stage is upgraded from HY-World 1.0 with improved 3D scene comprehension.
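
The four stages run strictly in sequence, each consuming the previous stage's output. The sketch below illustrates only that ordering. `WorldState`, `make_stage`, `run_pipeline`, and the stage identifiers are hypothetical names chosen for illustration, not the HY-World 2.0 API, and each stage is a placeholder that merely records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorldState:
    """Hypothetical container handed from stage to stage (not the repo's data model)."""
    source: str                                  # e.g. a text prompt or an image path
    completed: List[str] = field(default_factory=list)


Stage = Callable[[WorldState], WorldState]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only appends its name to the state."""
    def stage(state: WorldState) -> WorldState:
        return WorldState(state.source, state.completed + [name])
    return stage


# Stage order as described in the text; the real work is assumed, not implemented.
PIPELINE: List[Stage] = [
    make_stage("panoramic_generation"),
    make_stage("trajectory_planning"),
    make_stage("stereo_expansion"),
    make_stage("scene_assembly"),
]


def run_pipeline(source: str) -> WorldState:
    """Feed the input through every stage in order."""
    state = WorldState(source)
    for stage in PIPELINE:
        state = stage(state)
    return state


if __name__ == "__main__":
    print(run_pipeline("a quiet harbor at dawn").completed)
```

Modelling each stage as a function from state to state keeps the order explicit and would let a single stage be swapped or tested in isolation. For the actual entry points, consult the repository linked under "Try it".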

Why does it matter?

With 1,770 upvotes on Hugging Face, it is the top paper of the April 2026 cycle. The models and code are fully open-sourced, giving an open alternative to closed, proprietary 3D world systems.

Try it

https://github.com/Tencent-Hunyuan/HY-World-2.0

Tags

  • 3d
  • world-model
  • multimodal
  • generation
  • open-source
  • paper
