Research 2026-03-27

3D Scene Generation
Open-Source Reproduction Plan

Starting from WorldMesh, a survey of the best open-source 3D scene generation projects available today.
Text/image input → navigable 3D worlds. All source code is saved locally and will be reproduced one project at a time once a GPU is available.

7 Open-Source Projects
5 Top-Venue Papers
11K+ GitHub Stars

Origin: WorldMesh

WorldMesh proposed a geometry-first approach to 3D scene generation and sparked this research, but its code has not been released. We found 7 reproducible alternatives.

WorldMesh

arXiv 2603.22972 · TUM · 2026-03-24 · Manuel-Andreas Schneider, Angela Dai

Core idea: Geometry-First. Text → floor plan → 3D mesh scaffold (walls, floors, structure) → mesh-conditioned image diffusion to synthesize appearance → 3D Gaussian Splatting for a navigable scene. Supports large-scale multi-room generation in styles ranging from Ancient Roman to Cyberpunk. 96.2% user preference over baselines.

Code Not Released · arXiv Preprint · Mesh-Conditioned Diffusion · 3D Gaussian Splatting

Tier 1: Ready to Run

Complete code releases, community-verified, well documented

#1
Tier 1

WorldGen

1,592 Stars · Apache-2.0 · 2025-04 · Actively Maintained

Text/Image → navigable 3D scene. Built on panorama generation plus Gaussian Splatting, and installable with pip. A low-VRAM mode (10GB) makes it the lowest-barrier option currently available.

Text / Image → Panorama (FLUX.1) → Depth Estimation → 3D Gaussian Splat → Viser Navigation
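
The panorama and depth stages above can be pictured as lifting each panorama pixel to a 3D point before splats are fitted. The following is a minimal sketch of that geometric step, assuming an equirectangular panorama; the function name and the axis convention (+y up, +z towards the panorama centre) are illustrative assumptions and are not taken from WorldGen's code.

```python
import math


def panorama_pixel_to_point(u, v, width, height, depth):
    """Lift one equirectangular pixel with a depth value to a 3D point.

    Assumed convention (hypothetical, not WorldGen's): +y is up, +z points
    at the centre of the panorama, and depth is the distance along the ray.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi]
    lat = (0.5 - v / height) * math.pi        # latitude in [-pi/2, pi/2]
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)


# The centre pixel of a 1024x512 panorama looks straight down +z.
centre = panorama_pixel_to_point(512, 256, 1024, 512, depth=2.0)
```

Every returned point lies at distance `depth` from the camera, so a dense depth map turns the panorama into a point cloud around the viewer.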
#2
Tier 1

Text2Room

1,082 Stars · ICCV 2023 · TUM

The foundational work in this area. Text → textured 3D room mesh. It iteratively generates views, inpaints them, aligns depth, and fuses the result into a mesh. The codebase is clean and stable, and its reproducibility has been widely verified.

Text Prompt → Stable Diffusion 2 → Depth Alignment → Mesh Fusion → Textured 3D Mesh
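
The depth alignment stage exists because a monocular depth prediction for a new view is only known up to scale and offset, while the mesh built so far already fixes the depth of overlapping pixels. One common way to reconcile the two is a least-squares scale-and-shift fit. The sketch below shows that generic technique in plain Python; it is an assumption for illustration, not Text2Room's actual implementation, and the function names are hypothetical.

```python
def fit_scale_shift(predicted, reference):
    """Least-squares scale s and shift t so that s * predicted + t ≈ reference.

    Pixels whose reference depth is None (no geometry rendered there yet)
    are excluded from the fit.
    """
    pairs = [(p, r) for p, r in zip(predicted, reference) if r is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two overlapping pixels")
    sum_p = sum(p for p, _ in pairs)
    sum_r = sum(r for _, r in pairs)
    sum_pp = sum(p * p for p, _ in pairs)
    sum_pr = sum(p * r for p, r in pairs)
    denom = n * sum_pp - sum_p * sum_p
    if denom == 0:
        raise ValueError("predicted depth is constant over the overlap")
    scale = (n * sum_pr - sum_p * sum_r) / denom
    shift = (sum_r - scale * sum_p) / n
    return scale, shift


def align_depth(predicted, reference):
    """Apply the fitted scale and shift to every predicted pixel."""
    scale, shift = fit_scale_shift(predicted, reference)
    return [scale * p + shift for p in predicted]


# Flattened toy depth maps: the second pixel has no existing geometry.
predicted = [1.0, 2.0, 3.0, 4.0]
reference = [2.5, None, 6.5, 8.5]
aligned = align_depth(predicted, reference)
```

Pixels without existing geometry (the `None` entry) still receive an aligned value, which is what lets newly generated content attach to the existing mesh at a consistent depth.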
#3
Tier 1

Infinigen

6,878 Stars · CVPR 2023 + 2024 · Princeton · BSD

Princeton's procedural generation framework for photorealistic indoor and outdoor 3D scenes. 100% procedural, with no external assets required. The most mature project here (3,214 commits); it also runs on a Mac CPU without an NVIDIA GPU. Well suited to generating training datasets.

Procedural Rules → Blender Generation → Photorealistic Render
#4
Tier 1

WonderWorld

717 Stars · CVPR 2025

The closest option to WorldMesh. Single image → connected, navigable 3D scenes, built on Fast Layered Gaussian Surfels (FLAGS). Supports interactive navigation in the browser, at under 10 seconds per new view.

Single Image → Depth + Segmentation → Layered Gaussian Surfels → Interactive Navigation

Tier 2: Higher Barrier

Code available, but with a more complex build or higher hardware requirements

#5
Tier 2

LayerPano3D

315 Stars · SIGGRAPH 2025

Text → layered 360° panoramic 3D scene + Gaussian Splatting. The most immersive result of the group. Requires compiling C++ extensions (Ceres solver, 360monodepth), so the build process is more involved.

#6
Tier 2

RealmDreamer

297 Stars · 3DV 2025

Text → 3D scene (Gaussian Splatting), with 88-95% user preference in its studies. Pre-generated outputs are provided so the slow stages can be skipped; Stage 2 training takes several hours.

#7
Tier 2

SceneCraft

233 Stars · NeurIPS 2024

Text + spatial layout → multi-room apartment 3D scenes (NeRF). Supports complex floor plans. Its multi-room capability is the closest to WorldMesh, but it requires 2+ GPUs.

Quick Decision Guide

Pick the project that best fits your scenario and hardware

Fastest Results

pip install, 10GB VRAM minimum, text straight to a 3D scene

WorldGen
🎓

Learn the Fundamentals

ICCV 2023 foundational work, clean code, most-cited paper

Text2Room
💻

No NVIDIA GPU

Runs on CPU, Blender-based procedural generation, works on Mac

Infinigen
🌐

Closest to WorldMesh

Interactive navigation, connected multi-room scenes, but needs 48GB VRAM

WonderWorld
🎬

360° Immersive Panorama

SIGGRAPH-level quality, 360-degree surround 3D scenes

LayerPano3D
🏢

Multi-Room Apartments

Floor-plan control, the most precise spatial layout option

SceneCraft
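
The guide above amounts to a lookup from goal to project, with the GPU constraint taking priority. A minimal sketch of that selection logic follows; the goal keys and the function name are hypothetical labels for the six cards, and the choice to let a missing NVIDIA GPU override the goal is an assumption about how the guide is meant to be read.

```python
# Goal labels (hypothetical) mapped to the project each card recommends.
GUIDE = {
    "fastest_results": "WorldGen",
    "learn_fundamentals": "Text2Room",
    "no_nvidia_gpu": "Infinigen",
    "closest_to_worldmesh": "WonderWorld",
    "immersive_360": "LayerPano3D",
    "multi_room_apartments": "SceneCraft",
}


def pick_project(goal, has_nvidia_gpu=True):
    """Return the recommended project name for a goal.

    Without an NVIDIA GPU the guide lists only Infinigen as CPU compatible,
    so that constraint overrides whatever goal was requested.
    """
    if not has_nvidia_gpu:
        return GUIDE["no_nvidia_gpu"]
    if goal not in GUIDE:
        raise KeyError(f"unknown goal: {goal!r}")
    return GUIDE[goal]


choice = pick_project("multi_room_apartments")
fallback = pick_project("immersive_360", has_nvidia_gpu=False)
```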

Local Source Code Status

All source code has been cloned into repos/ and will be reproduced one project at a time once a GPU is available

Project        Local Path             Stars    Venue          GPU          Status
WorldGen       repos/WorldGen/        1,592    Independent    10-24GB      Saved
Text2Room      repos/text2room/       1,082    ICCV 2023      16-24GB      Saved
Infinigen      repos/infinigen/       6,878    CVPR 23+24     CPU OK       Saved
WonderWorld    repos/WonderWorld/     717      CVPR 2025      48GB         Saved
LayerPano3D    repos/LayerPano3D/     315      SIGGRAPH 25    16-24GB      Saved
RealmDreamer   repos/realmdreamer/    297      3DV 2025       CUDA 11.8    Saved
SceneCraft     repos/SceneCraft/      233      NeurIPS 24     2+ GPU       Saved
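
Before starting any reproduction run it can help to confirm that every clone listed above is actually present. The sketch below is a small check under the assumption that it runs from the directory containing repos/; the paths and star counts are copied from the table, while the helper name is hypothetical.

```python
from pathlib import Path

# (project, local path, GitHub stars) copied from the status table.
REPOS = [
    ("WorldGen", "repos/WorldGen/", 1592),
    ("Text2Room", "repos/text2room/", 1082),
    ("Infinigen", "repos/infinigen/", 6878),
    ("WonderWorld", "repos/WonderWorld/", 717),
    ("LayerPano3D", "repos/LayerPano3D/", 315),
    ("RealmDreamer", "repos/realmdreamer/", 297),
    ("SceneCraft", "repos/SceneCraft/", 233),
]


def missing_repos(root="."):
    """Return the names of projects whose clone directory does not exist."""
    base = Path(root)
    return [name for name, rel_path, _ in REPOS if not (base / rel_path).is_dir()]


# Combined star count across the seven projects.
total_stars = sum(stars for _, _, stars in REPOS)

if __name__ == "__main__":
    absent = missing_repos()
    print("missing:", ", ".join(absent) if absent else "none")
    print("total stars:", total_stars)
```

The star counts in the table sum to 11,114, which matches the "11K+" figure quoted at the top of the page.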

Evolution Timeline

Key milestones in 3D scene generation over recent years

2023-06
Text2Room (ICCV 2023)
First text-to-textured 3D room mesh; laid the foundation for this direction
2023-06
Infinigen (CVPR 2023)
Princeton's procedural generation framework for infinite photorealistic 3D worlds
2023-11
MVDiffusion (NeurIPS 2023 Spotlight)
Multi-view consistent image generation; became a building block for later panorama-based methods
2024-01
Infinigen Indoors (CVPR 2024)
Extended to indoor scenes, procedurally generating furniture, kitchens, bathrooms, and more
2024-09
SceneCraft (NeurIPS 2024)
Layout-guided multi-room 3D scene generation with NeRF output
2024-09
DreamScene360 (ECCV 2024)
360° panoramic Gaussian Splatting scene generation
2025-01
WonderWorld (CVPR 2025)
Interactive, navigable 3D scenes from a single image, using Fast Layered Gaussian Surfels
2025-04
WorldGen
Currently the most accessible text-to-3D scene method; 10GB of VRAM is enough
2025-07
LayerPano3D (SIGGRAPH 2025)
Layered 360° panorama + Gaussian Splatting for a highly immersive experience
2026-03
WorldMesh (arXiv)
Geometry-first strategy in which a mesh scaffold constrains multi-room generation; code pending release

References

Further reading and tracking

Awesome 3D Scene Generation

959 Stars, the most comprehensive list of papers + code

GitHub →

WorldMesh Paper

arXiv 2603.22972; to be reproduced as soon as the code is released

arXiv →

WorldMesh GitHub

Watch the repo to be notified when the code is released

GitHub →