Eyeline Labs / Netflix · 2026-04-23 · notable

Vista4D: Re-Render Any Video from a New Camera Angle (CVPR 2026 Highlight)

Vista4D, a CVPR 2026 Highlight from Eyeline Labs and Netflix, scaffolds a monocular video with a 4D point cloud, then re-renders the scene along a user-chosen camera path using a fine-tuned Wan2.1 14B diffusion transformer. Users preferred Vista4D over the best baseline 77.4% of the time for overall fidelity. The paper has 76 Hugging Face upvotes.

Figure: Vista4D teaser, showing video reshooting from a 4D point-cloud scaffold with novel camera trajectories (CVPR 2026).

Give any monocular video a 4D point-cloud scaffold, then re-render the scene from any camera angle you choose.

Key specs

GitHub stars	81
User preference (overall fidelity)	77.4%
Hugging Face upvotes	76
Conference	CVPR 2026 Highlight

What is it?

Vista4D is a CVPR 2026 Highlight paper from researchers at Eyeline Labs, Netflix, Columbia, UCLA, Stony Brook, and Oxford. It takes a monocular source video and synthesizes the same dynamic scene from entirely different camera trajectories and viewpoints — reshooting footage that was never physically captured from that angle.

How does it work?

The method reconstructs a temporally persistent 4D point cloud from the source video using static pixel segmentation and depth estimation. A fine-tuned Wan2.1 14B video diffusion transformer is then conditioned on both the source video and a point-cloud render of the target view. The conditioning is supplied by concatenating latent tokens in context rather than through cross-attention. The system is trained on noisy, reconstructed multi-view data specifically so that it is robust to depth estimation artifacts at inference time. Optional extensions support 4D scene recomposition (editing the point cloud to insert or remove subjects) and dynamic scene expansion (incorporating additional camera captures).
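The geometric half of this pipeline, lifting a depth map to points and rendering those points from a target camera, can be illustrated with a minimal NumPy sketch. This is not code from the Vista4D repository. It assumes a pinhole camera with known intrinsics K and known 4x4 poses, handles a single frame, and omits static pixel segmentation, temporal persistence, and the diffusion model. The function names unproject_depth and render_points are hypothetical.

```python
import numpy as np


def unproject_depth(depth, K, cam_to_world):
    """Lift a depth map (H, W) to world-space points (H*W, 3), pinhole model assumed."""
    h, w = depth.shape
    # Pixel grid: u indexes columns (x), v indexes rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Back-project: X_cam = depth * K^-1 [u, v, 1]^T
    rays = pix @ np.linalg.inv(K).T
    pts_cam = rays * depth.reshape(-1, 1)
    # Camera-to-world: X_world = R X_cam + t
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t


def render_points(points, colors, K, world_to_cam, h, w):
    """Project world-space points into a target view, keeping the nearest point per pixel.

    Returns (image, mask). mask is False where no point landed, i.e. holes
    that the point cloud alone cannot explain.
    """
    R, t = world_to_cam[:3, :3], world_to_cam[:3, 3]
    pts_cam = points @ R.T + t
    z = pts_cam[:, 2]

    # Discard points behind (or on) the camera plane.
    in_front = z > 1e-6
    pts_cam, z, colors = pts_cam[in_front], z[in_front], colors[in_front]

    # Perspective projection to integer pixel coordinates.
    proj = pts_cam @ K.T
    u = np.round(proj[:, 0] / z).astype(int)
    v = np.round(proj[:, 1] / z).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, z, colors = u[inside], v[inside], z[inside], colors[inside]

    # Z-buffer: sort by pixel index, then by depth, and keep the first
    # (nearest) point that falls on each pixel.
    flat = v * w + u
    order = np.lexsort((z, flat))  # last key (flat) is the primary sort key
    flat, colors = flat[order], colors[order]
    _, first = np.unique(flat, return_index=True)

    image = np.zeros((h * w, 3), dtype=colors.dtype)
    mask = np.zeros(h * w, dtype=bool)
    image[flat[first]] = colors[first]
    mask[flat[first]] = True
    return image.reshape(h, w, 3), mask.reshape(h, w)
```

With an identity target pose the render reproduces the source frame. Moving the target camera sideways leaves unfilled pixels in the mask, which is presumably the kind of region the diffusion transformer is left to synthesize.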

Why does it matter?

Prior video reshooting methods fail on real-world footage because depth estimation artifacts cause jitter and visual inconsistency across viewpoints. Vista4D's training approach addresses this directly, producing stable re-renders under large viewpoint changes. Users preferred it over the strongest baseline 77.4% of the time for overall fidelity. The method is relevant for VFX teams generating alternate angles from a single camera pass, for autonomous driving data augmentation, and for video editing tools that need synthetic multi-viewpoint data.

Who is it for?

Computer vision researchers; VFX and video production teams; developers building video editing tools

Try it

Code and weights: https://github.com/Eyeline-Labs/Vista4D