VistaDream is a novel framework for reconstructing 3D scenes from single-view images using Flux-based diffusion models. This implementation combines image outpainting, depth estimation, and 3D Gaussian splatting for high-quality 3D scene generation, with integrated visualization using Rerun.
This implementation uses Rerun for 3D visualization, Gradio for the interactive UI, Flux for diffusion-based outpainting, and Pixi for installation.
VistaDream addresses the challenge of 3D scene reconstruction from a single image through a novel two-stage pipeline:
- Coarse 3D Scaffold Construction: Creates a global scene structure by outpainting image boundaries and estimating depth maps
- Multi-view Consistency Sampling (MCS): Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views
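The first stage can be illustrated with two small geometric helpers. This is a minimal sketch and not the project's actual code: `outpaint_mask` and `unproject_depth` are hypothetical names, the uniform padding scheme is an assumption, and the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`) are assumed to be known. The sketch marks the border region a diffusion model would have to fill, then lifts a depth map into 3D points that could seed a coarse scaffold.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def outpaint_mask(height: int, width: int, pad: int) -> List[List[int]]:
    """Mask for a canvas padded by `pad` pixels on every side.

    1 marks pixels to be generated by outpainting, 0 marks original pixels.
    (Hypothetical helper; uniform padding is an assumption.)
    """
    return [
        [0 if pad <= r < pad + height and pad <= c < pad + width else 1
         for c in range(width + 2 * pad)]
        for r in range(height + 2 * pad)
    ]


def unproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float, fy: float, cx: float, cy: float,
) -> List[Point]:
    """Lift a depth map to camera-space 3D points with a pinhole model.

    Pixels with non-positive depth are treated as invalid and skipped.
    (Hypothetical helper; the intrinsics are assumed inputs.)
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    mask = outpaint_mask(height=2, width=2, pad=1)
    print(sum(map(sum, mask)), "pixels to outpaint")
    print(unproject_depth([[1.0, 2.0], [0.0, 4.0]], fx=1.0, fy=1.0, cx=0.5, cy=0.5))
```

In a real pipeline the resulting points would typically carry the pixel colors as well and be used to initialize the Gaussians. The pure-Python loops here are only for readability, and an array library would be the usual choice at image resolution.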
The framework integrates multiple state-of-the-art models:
- Flux diffusion models for high-quality image outpainting and inpainting
- 3D Gaussian Splatting for efficient 3D scene representation
- Rerun for real-time 3D visualization and debugging
- Linux only with NVIDIA GPU (CUDA 12.8)
- Pixi package manager
This will automatically download the required models and run the example with the included office image.
Generate a complete 3D scene from a single image with outpainting, depth estimation, and Gaussian splatting:
Note: The full 3D reconstruction pipeline is currently under active development. Some features may be experimental or incomplete.
Process a single image with depth estimation and basic 3D reconstruction:
Run just the outpainting component with Rerun visualization:
Launch an interactive web interface for experimenting with the models:
- Single Image to 3D: Complete pipeline from single image to navigable 3D scene
- Memory Efficient: Model offloading support for GPU memory management
- Real-time Visualization: Integrated Rerun viewer for 3D scene inspection
- Training-free: No fine-tuning required for existing diffusion models
- High Quality: Multi-view consistency sampling ensures coherent 3D reconstruction
Models are automatically downloaded from Hugging Face on first run. Manual download:
Expected structure:
Thanks to the original authors! If you use VistaDream in your research, please cite:
This project builds upon several outstanding works:
- Flux - Black Forest Labs for the diffusion model foundation
- 3D Gaussian Splatting - Inria for efficient 3D representation
- Rerun - Rerun.io for the 3D visualization framework
- GSplat - Nerfstudio for the Gaussian splatting implementation
- MoGe - Microsoft Research for monocular geometry estimation
