Tencent Hunyuan · 2026-04-16 · major
HY-World 2.0 — Tencent Open-Sources 3D World Model: Image or Text to Navigable 3D Scene
Tencent has open-sourced HY-World 2.0, which turns text, images, or video into editable 3D scenes exported as meshes or 3D Gaussian Splatting (3DGS) representations. Outputs are compatible with Blender, Unreal Engine, Unity, and Isaac Sim. The repository has 1.5K GitHub stars.
Tencent's open 3D world model turns a single image or text prompt into a fully navigable, editable 3D scene — real geometry, not video.
Key specs
| Spec | Value |
|---|---|
| GitHub stars | 1.5K |
| Output formats | Mesh, 3DGS, point cloud |
What is it?
HY-World 2.0 is a multi-modal world model from Tencent Hunyuan. Unlike video world models, which output pixel sequences, it produces real 3D assets (meshes, 3D Gaussian Splatting scenes, and point clouds) that import into Blender, Unreal Engine, Unity, or NVIDIA Isaac Sim. Its WorldMirror 2.0 sub-model reconstructs 3D scenes from multi-view images or casual video in a single forward pass.
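To make "real 3D assets" concrete, here is a minimal sketch of what a colored point cloud looks like once serialized to ASCII PLY, a format Blender can import. This is only an illustration: the source does not say which file formats HY-World 2.0 writes, and the function name `point_cloud_to_ply` is hypothetical, not part of the project.

```python
def point_cloud_to_ply(points, colors):
    """Serialize XYZ points and 8-bit RGB colors to an ASCII PLY string.

    Hypothetical helper for illustration; not part of HY-World 2.0.
    """
    if len(points) != len(colors):
        raise ValueError("points and colors must have the same length")

    # Standard ASCII PLY header: one 'vertex' element with position and color.
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]

    # One line per vertex: x y z r g b
    rows = [
        f"{x} {y} {z} {r} {g} {b}"
        for (x, y, z), (r, g, b) in zip(points, colors)
    ]
    return "\n".join(header + rows) + "\n"


if __name__ == "__main__":
    demo = point_cloud_to_ply(
        [(0.0, 0.0, 0.0), (1.0, 0.5, 2.0)],
        [(255, 0, 0), (0, 255, 0)],
    )
    print(demo)
```

Unlike a video frame, a file like this holds explicit geometry, so it can be edited, re-lit, or re-rendered from any viewpoint after import.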
How does it work?
The pipeline runs four stages: (1) HY-Pano 2.0 generates a panorama from the input; (2) WorldNav plans a navigation trajectory; (3) WorldStereo 2.0 expands the scene with stereo depth estimation; (4) WorldMirror 2.0 fuses everything into 3DGS or mesh output. In a single forward pass, WorldMirror 2.0 predicts depth maps, surface normals, camera parameters, and 3DGS attributes, and it supports flexible-resolution inference on inputs of 50K to 500K pixels. The WorldMirror 2.0 weights and inference code are available now; weights for the full generation pipeline are coming soon.
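The 50K to 500K pixel range suggests that callers may need to resize frames before inference. The sketch below shows one way to do that while preserving aspect ratio. It is an assumption about how a user might prepare inputs, not the project's own preprocessing; the name `fit_to_pixel_budget` and its defaults are hypothetical, with only the two bounds taken from the text.

```python
import math

MIN_PIXELS = 50_000    # lower bound quoted in the text
MAX_PIXELS = 500_000   # upper bound quoted in the text


def fit_to_pixel_budget(width, height,
                        min_pixels=MIN_PIXELS, max_pixels=MAX_PIXELS):
    """Return (new_width, new_height) with an area inside the pixel budget.

    Hypothetical helper: scales both sides by the same factor so the
    aspect ratio is kept up to integer rounding.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    area = width * height
    if area > max_pixels:
        # Shrink; rounding down keeps the area at or below the upper bound.
        scale = math.sqrt(max_pixels / area)
        return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
    if area < min_pixels:
        # Enlarge; rounding up keeps the area at or above the lower bound.
        scale = math.sqrt(min_pixels / area)
        return math.ceil(width * scale), math.ceil(height * scale)
    # Already inside the budget: leave the resolution unchanged.
    return width, height


if __name__ == "__main__":
    for size in [(1920, 1080), (640, 480), (100, 100)]:
        print(size, "->", fit_to_pixel_budget(*size))
```

Scaling by the square root of the area ratio changes both dimensions equally, which is why the aspect ratio survives. Very elongated images could still leave the budget after rounding, so a production version would need an extra check.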
Why does it matter?
Video world models output non-editable pixel streams. HY-World 2.0 produces persistent, game-engine-compatible assets. For game developers this means generating level prototypes from text or reference images directly. For robotics and simulation, navigable 3D worlds from casual video speed up digital twin creation.
Who is it for?
Game developers, robotics and simulation researchers, VFX and 3D artists, digital twin builders.
Try it
git clone https://github.com/Tencent-Hunyuan/HY-World-2.0 && cd HY-World-2.0 && pip install -r requirements.txt