DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation
Abstract
DreamScene is an end-to-end framework that generates high-quality, editable 3D scenes from text or dialogue, achieving automation, 3D consistency, and fine-grained control through scene planning, graph-based placement, formation pattern sampling, and progressive camera sampling.
Generating 3D scenes from natural language holds great promise for applications in gaming, film, and design. However, existing methods struggle with automation, 3D consistency, and fine-grained control. We present DreamScene, an end-to-end framework for high-quality and editable 3D scene generation from text or dialogue. DreamScene begins with a scene planning module, where a GPT-4 agent infers object semantics and spatial constraints to construct a hybrid graph. A graph-based placement algorithm then produces a structured, collision-free layout. Based on this layout, Formation Pattern Sampling (FPS) generates object geometry using multi-timestep sampling and reconstructive optimization, enabling fast and realistic synthesis. To ensure global consistency, DreamScene employs a progressive camera sampling strategy tailored to both indoor and outdoor settings. Finally, the system supports fine-grained scene editing, including object movement, appearance changes, and 4D dynamic motion. Experiments demonstrate that DreamScene surpasses prior methods in quality, consistency, and flexibility, offering a practical solution for open-domain 3D content creation. Code and demos are available at https://jahnsonblack.github.io/DreamScene-Full/.
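To make the layout step concrete, here is a minimal sketch of graph-constrained, collision-free object placement via rejection sampling. This is an illustrative simplification, not DreamScene's actual placement algorithm: the `place_scene` function, the axis-aligned 2D footprints, and the distance-only constraint edges are all assumptions introduced for this example, whereas the paper's method operates on a hybrid graph built by a GPT-4 agent.

```python
import random
from dataclasses import dataclass

@dataclass
class Box:
    # Object footprint: center (x, y) with width w and depth d.
    x: float
    y: float
    w: float
    d: float

    def overlaps(self, other: "Box") -> bool:
        # Axis-aligned overlap test on centers and half-extents.
        return (abs(self.x - other.x) < (self.w + other.w) / 2 and
                abs(self.y - other.y) < (self.d + other.d) / 2)

def place_scene(objects, constraints, room=(6.0, 6.0), max_tries=1000):
    """Rejection-sample positions until every pairwise distance
    constraint holds and no two footprints collide.

    objects: name -> (width, depth)
    constraints: list of (a, b, min_dist, max_dist) graph edges
    """
    rw, rd = room
    for _ in range(max_tries):
        # Sample each object uniformly, keeping footprints inside the room.
        layout = {
            name: Box(random.uniform(w / 2, rw - w / 2),
                      random.uniform(d / 2, rd - d / 2), w, d)
            for name, (w, d) in objects.items()
        }
        boxes = list(layout.values())
        # Reject layouts with any pairwise collision.
        if any(a.overlaps(b)
               for i, a in enumerate(boxes) for b in boxes[i + 1:]):
            continue
        # Reject layouts violating any constraint edge of the graph.
        if all(lo <= ((layout[a].x - layout[b].x) ** 2 +
                      (layout[a].y - layout[b].y) ** 2) ** 0.5 <= hi
               for a, b, lo, hi in constraints):
            return layout
    raise RuntimeError("no feasible layout found")

# Hypothetical scene: a sofa near a TV, with a table close to the sofa.
objects = {"sofa": (2.0, 0.9), "tv": (1.2, 0.4), "table": (1.0, 0.6)}
constraints = [("sofa", "tv", 2.0, 4.0), ("table", "sofa", 0.5, 1.5)]
print(place_scene(objects, constraints))
```

Rejection sampling suffices for this toy setup; a structured placement algorithm like the one in the paper scales to dense scenes where random sampling would rarely find a feasible configuration.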
Community
DreamScene is an end-to-end framework for high-quality, consistent, and editable 3D scene generation from text. It is an extended version of our ECCV 2024 paper “DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling.”
This is an automated message from the Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- Video Perception Models for 3D Scene Synthesis (2025)
- LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization (2025)
- X-Scene: Large-Scale Driving Scene Generation with High Fidelity and Flexible Controllability (2025)
- RoomCraft: Controllable and Complete 3D Indoor Scene Generation (2025)
- Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning (2025)
- HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels (2025)
- CoCo4D: Comprehensive and Complex 4D Scene Generation (2025)