"To see a World in a Grain of Sand, and a Heaven in a Wild Flower"
- July 26, 2025: 👋 We present the technical report of HunyuanWorld-1.0. Please check out the details and spark some discussion!
- July 26, 2025: 🤗 We release the first open-source, simulation-capable, immersive 3D world generation model, HunyuanWorld-1.0!
Join our WeChat and Discord groups to discuss and get help from us.
Creating immersive and playable 3D worlds from texts or images remains a fundamental challenge in computer vision and graphics. Existing world generation approaches typically fall into two categories: video-based methods that offer rich diversity but lack 3D consistency and rendering efficiency, and 3D-based methods that provide geometric consistency but struggle with limited training data and memory-inefficient representations. To address these limitations, we present HunyuanWorld 1.0, a novel framework that combines the best of both sides for generating immersive, explorable, and interactive 3D worlds from text and image conditions. Our approach features three key advantages: 1) 360° immersive experiences via panoramic world proxies; 2) mesh export capabilities for seamless compatibility with existing computer graphics pipelines; 3) disentangled object representations for augmented interactivity. The core of our framework is a semantically layered 3D mesh representation that leverages panoramic images as 360° world proxies for semantic-aware world decomposition and reconstruction, enabling the generation of diverse 3D worlds. Extensive experiments demonstrate that our method achieves state-of-the-art performance in generating coherent, explorable, and interactive 3D worlds while enabling versatile applications in virtual reality, physical simulation, game development, and interactive content creation.
Tencent HunyuanWorld-1.0's generation architecture integrates panoramic proxy generation, semantic layering, and hierarchical 3D reconstruction to achieve high-quality scene-scale 360° 3D world generation, supporting both text and image inputs.
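The panoramic proxy works because an equirectangular panorama assigns every pixel a unique viewing direction on the sphere, so a single image can stand in for the full 360° surroundings during decomposition and reconstruction. As a minimal sketch of that idea (using the standard equirectangular convention; the exact axis convention in the codebase may differ):

```python
import math

def pixel_to_ray(u, v, width, height):
    """Map an equirectangular panorama pixel (u, v) to a unit ray direction.

    Longitude spans [-pi, pi] across the image width; latitude spans
    [pi/2, -pi/2] from the top row to the bottom row. This is the common
    equirectangular convention, assumed here for illustration.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # horizontal angle
    lat = (0.5 - v / height) * math.pi        # vertical angle
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

# The image center looks straight ahead along +z
print(pixel_to_ray(1024, 512, 2048, 1024))  # → (0.0, 0.0, 1.0)
```

Lifting each pixel onto such a ray is what lets the layered meshes be reconstructed with consistent geometry all the way around the viewer.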
We have evaluated HunyuanWorld 1.0 against other open-source panorama generation and 3D world generation methods. The numerical results indicate that HunyuanWorld 1.0 surpasses these baselines in visual quality and geometric consistency.
Text-to-panorama generation

| Method | BRISQUE (↓) | NIQE (↓) | Q-Align (↑) | CLIP-T (↑) |
| --- | --- | --- | --- | --- |
| Diffusion360 | 69.5 | 7.5 | 1.8 | 20.9 |
| MVDiffusion | 47.9 | 7.1 | 2.4 | 21.5 |
| PanFusion | 56.6 | 7.6 | 2.2 | 21.0 |
| LayerPano3D | 49.6 | 6.5 | 3.7 | 21.5 |
| HunyuanWorld 1.0 | 40.8 | 5.8 | 4.4 | 24.3 |
Image-to-panorama generation

| Method | BRISQUE (↓) | NIQE (↓) | Q-Align (↑) | CLIP-I (↑) |
| --- | --- | --- | --- | --- |
| Diffusion360 | 71.4 | 7.8 | 1.9 | 73.9 |
| MVDiffusion | 47.7 | 7.0 | 2.7 | 80.8 |
| HunyuanWorld 1.0 | 45.2 | 5.8 | 4.3 | 85.1 |
Text-to-world generation

| Method | BRISQUE (↓) | NIQE (↓) | Q-Align (↑) | CLIP-T (↑) |
| --- | --- | --- | --- | --- |
| Director3D | 49.8 | 7.5 | 3.2 | 23.5 |
| LayerPano3D | 35.3 | 4.8 | 3.9 | 22.0 |
| HunyuanWorld 1.0 | 34.6 | 4.3 | 4.2 | 24.0 |
Image-to-world generation

| Method | BRISQUE (↓) | NIQE (↓) | Q-Align (↑) | CLIP-I (↑) |
| --- | --- | --- | --- | --- |
| WonderJourney | 51.8 | 7.3 | 3.2 | 81.5 |
| DimensionX | 45.2 | 6.3 | 3.5 | 83.3 |
| HunyuanWorld 1.0 | 36.2 | 4.6 | 3.9 | 84.5 |
The open-source version of HunyuanWorld 1.0 is built on Flux; the method can be easily adapted to other image generation models such as Hunyuan Image, Kontext, and Stable Diffusion.
| Model | Description | Date | Size | Download |
| --- | --- | --- | --- | --- |
| HunyuanWorld-PanoDiT-Text | Text-to-Panorama Model | 2025-07-26 | 478MB | Download |
| HunyuanWorld-PanoDiT-Image | Image-to-Panorama Model | 2025-07-26 | 478MB | Download |
| HunyuanWorld-PanoInpaint-Scene | PanoInpaint Model for scenes | 2025-07-26 | 478MB | Download |
| HunyuanWorld-PanoInpaint-Sky | PanoInpaint Model for sky | 2025-07-26 | 120MB | Download |
You can follow the steps below to use HunyuanWorld 1.0:
We test our model with Python 3.10 and PyTorch 2.5.0+cu124.
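A typical environment setup matching that tested configuration might look like the following; this is a sketch of the usual conda + pip flow, and the requirements file name is an assumption rather than something stated in this README:

```shell
# Sketch of an environment setup; the requirements file name is an assumption.
conda create -n hunyuanworld python=3.10 -y
conda activate hunyuanworld
# Install PyTorch 2.5.0 built against CUDA 12.4, matching the tested configuration
pip install torch==2.5.0 torchvision --index-url https://download.pytorch.org/whl/cu124
# Install the repository's remaining dependencies
pip install -r requirements.txt
```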
For Image to World generation, you can use the following code:
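The original code listing is not reproduced here. As a hedged sketch only: given the two-stage pipeline described above (panorama proxy, then layered reconstruction), an invocation could plausibly take this shape, where the script names, flags, and paths are all assumptions, not taken from this README:

```shell
# Hypothetical two-stage invocation; script names, flags, and paths are assumptions.
# Stage 1: generate a 360° panorama proxy from the input image
python3 demo_panogen.py --image_path examples/case1/input.png --output_path results/case1
# Stage 2: reconstruct the layered, exportable 3D world from that panorama
python3 demo_scenegen.py --image_path results/case1/panorama.png --output_path results/case1
```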
For Text to World generation, you can use the following code:
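Again, the original listing is not reproduced here; the text-conditioned variant would differ only in the first stage, which takes a prompt instead of an image. In this sketch the script names, flags, and the example prompt are assumptions:

```shell
# Hypothetical text-conditioned invocation; script names and flags are assumptions.
python3 demo_panogen.py --prompt "A serene mountain lake at sunrise" --output_path results/case2
python3 demo_scenegen.py --image_path results/case2/panorama.png --output_path results/case2
```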
We provide more examples in the examples directory, which you can run directly for a quick start.
We provide a ModelViewer tool for quick visualization of your generated 3D worlds in a web browser.
Simply open modelviewer.html in your browser, upload the generated 3D scene files, and enjoy the real-time interactive experience.
Due to hardware limitations, certain scenes may fail to load.
- Inference Code
- Model Checkpoints
- Technical Report
- TensorRT Version
- RGBD Video Diffusion
We would like to thank the contributors to the Stable Diffusion, FLUX, diffusers, HuggingFace, Real-ESRGAN, ZIM, GroundingDINO, MoGe, Worldsheet, and WorldGen repositories for their open research.