AMAP-ML · 2026-06-15 · notable
DreamX-World 1.0 — Alibaba AMAP open-sources an interactive world model
Alibaba's AMAP research team releases a 5B Apache-2.0 video world model with camera navigation, scene revisit, and event control across photoreal, game-style, and stylized domains. Code, paper, and two checkpoints are out.
Open-source 5B world model that lets you steer the camera, revisit a scene, and stage events across photoreal, game, and stylized worlds.
What is it?
DreamX-World 1.0 is an interactive video world model from Alibaba's AMAP research group. Given a starting image or text prompt, plus camera moves and event instructions, it generates a coherent video that an agent or player can navigate.
How does it work?
The model is trained with a progressive pipeline that adds camera-aware conditioning, geometry-guided memory for scene revisit, structured event instruction tuning, autoregressive long-video generation with distillation, and a reinforcement-learning quality pass. A technique the authors call Efficient PRoPE projects positional encodings onto spatially reduced tokens, cutting inference latency by about 30% relative to full PRoPE.
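The summary above does not describe how Efficient PRoPE is implemented, so the following is only a back-of-the-envelope sketch of why attaching positional encodings to spatially reduced tokens is cheaper. The grid sizes, the reduction factor, and the helper name `token_count` are hypothetical, chosen for illustration and not taken from the paper or the released code.

```python
def token_count(frames: int, height: int, width: int, spatial_factor: int = 1) -> int:
    """Count tokens in a frames x height x width grid after spatial downsampling.

    Illustrative helper only; not part of the DreamX-World codebase.
    """
    if spatial_factor < 1:
        raise ValueError("spatial_factor must be >= 1")
    if height % spatial_factor or width % spatial_factor:
        raise ValueError("height and width must be divisible by spatial_factor")
    return frames * (height // spatial_factor) * (width // spatial_factor)


# Hypothetical latent grid: 4 frames of 16 x 24 tokens.
full = token_count(frames=4, height=16, width=24)

# Same grid with each spatial axis halved (assumed reduction factor of 2).
reduced = token_count(frames=4, height=16, width=24, spatial_factor=2)

# Any per-token positional-encoding work shrinks by this ratio.
ratio = full / reduced
print(full, reduced, ratio)
```

With these assumed sizes the encoding step touches a quarter of the tokens. The reported end-to-end saving of about 30% is smaller than that, presumably because positional encoding is only one part of total inference cost.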
Why does it matter?
World models that combine free camera control with memory across revisits and named events have mostly been closed (Project Genie, Decart Oasis). Shipping a 5B Apache-2.0 checkpoint that reaches 16 fps on eight RTX 5090s puts a comparable system in researchers' hands. AMAP reports a 57–62% human-preference win rate over HY-WorldPlay 1.5 and LingBot-World on the same evaluation.
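To put the throughput figure in concrete terms, the short sketch below converts the reported 16 fps on eight GPUs into a per-frame time budget. The function names and the 10-second clip length are illustrative assumptions; only the frame rate and GPU count come from the figures quoted above.

```python
import math


def frame_budget_ms(fps: float) -> float:
    """Wall-clock time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def frames_for_clip(seconds: float, fps: float) -> int:
    """Number of frames needed to cover a clip of the given length."""
    return math.ceil(seconds * fps)


FPS = 16      # reported throughput
NUM_GPUS = 8  # reported hardware: eight RTX 5090s

# Wall-clock budget per frame for the whole eight-GPU setup.
budget = frame_budget_ms(FPS)

# Aggregate GPU time spent per frame across all eight cards.
gpu_ms_per_frame = budget * NUM_GPUS

# Frames in a 10-second clip (clip length is an arbitrary example).
clip_frames = frames_for_clip(10, FPS)

print(budget, gpu_ms_per_frame, clip_frames)
```

At 16 fps the system as a whole has 62.5 ms of wall-clock time per frame, which corresponds to 500 GPU-milliseconds per frame when summed over the eight cards.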
Who is it for?
World-model researchers, robotics simulation teams, and developers running game-engine experiments.
Try it
https://github.com/AMAP-ML/DreamX-World
Key numbers
- camera control score: 73.75
- overall score: 84.76
- params: 5B
- fps (8× RTX 5090): 16
- GitHub stars: 325