PosterMaker

[Image: demo images]

PosterMaker has been accepted by CVPR 2025. Please visit the project page for more details.

Model

[Image: PosterMaker method overview]

PosterMaker is a framework for generating promotional product posters with accurate text rendering and high product fidelity. It uses TextRenderNet for precise character-level text control and SceneGenNet to preserve the appearance of the product. A two-stage training strategy optimizes text rendering and background generation separately. With this design, PosterMaker significantly outperforms existing methods, as reported in the paper.

For more technical details, please refer to the research paper.

Model Weights

The released models and their weight files are listed below.

| Model Name                          | Weight Name                | Download Link |
|-------------------------------------|----------------------------|---------------|
| TextRenderNet_v1                    | textrender_net-0415.pth    | HuggingFace   |
| SceneGenNet_v1                      | scenegen_net-0415.pth      | HuggingFace   |
| SceneGenNet_v1 with Reward Learning | scenegen_net-rl-0415.pth   | HuggingFace   |
| TextRenderNet_v2                    | textrender_net-1m-0415.pth | HuggingFace   |
| SceneGenNet_v2                      | scenegen_net-1m-0415.pth   | HuggingFace   |

NOTE: TextRenderNet_v2 is trained with more data in Stage 1, resulting in better text rendering. Related details can be found in Section 8 of the Supplementary Materials.
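
As a minimal sketch, the weights listed above could be fetched with the `huggingface_hub` library. The repository id `alimama-creative/PosterMaker` is taken from this model page. It is an assumption that the `.pth` files sit at the repository root, and the `WEIGHTS` mapping, the `download_weight` helper, and the `checkpoints` directory are hypothetical names introduced only for illustration.

```python
# Sketch only: assumes the weight files are stored at the root of the repo.
REPO_ID = "alimama-creative/PosterMaker"

# Model name -> weight file name, copied from the table above.
WEIGHTS = {
    "TextRenderNet_v1": "textrender_net-0415.pth",
    "SceneGenNet_v1": "scenegen_net-0415.pth",
    "SceneGenNet_v1 with Reward Learning": "scenegen_net-rl-0415.pth",
    "TextRenderNet_v2": "textrender_net-1m-0415.pth",
    "SceneGenNet_v2": "scenegen_net-1m-0415.pth",
}


def download_weight(model_name, local_dir="checkpoints"):
    """Download one weight file and return its local path (hypothetical helper)."""
    # Imported lazily so the mapping above can be used without the library installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=WEIGHTS[model_name],
        local_dir=local_dir,
    )
```

For example, `download_weight("TextRenderNet_v2")` would request `textrender_net-1m-0415.pth`. If the files live in a subfolder, the file names in `WEIGHTS` would need the corresponding path prefix.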

Known Limitations

The current model has the following known limitations, which stem from how textual elements and captions were processed when constructing our training dataset:

Text

  • During training, texts are restricted to at most 7 lines of up to 16 characters each; the same limits apply at inference.
  • The training data comes from e-commerce platforms, resulting in relatively simple text colors and font styles with limited design diversity. This leads to similarly simple styles in the inference outputs.
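
The text limits above can be checked before running inference. The following is a minimal sketch, not part of the PosterMaker code base: the `check_text_lines` helper is hypothetical, and it assumes that a "character" corresponds to one Python string element (one Unicode code point).

```python
MAX_LINES = 7   # maximum number of text lines, as stated above
MAX_CHARS = 16  # maximum number of characters per line, as stated above


def check_text_lines(lines):
    """Return a list of problems found; an empty list means the input fits the limits."""
    problems = []
    if len(lines) > MAX_LINES:
        problems.append(
            f"{len(lines)} lines given, at most {MAX_LINES} are supported"
        )
    for index, line in enumerate(lines):
        # Assumption: characters are counted as Unicode code points.
        if len(line) > MAX_CHARS:
            problems.append(
                f"line {index} has {len(line)} characters, "
                f"at most {MAX_CHARS} are supported"
            )
    return problems
```

Calling `check_text_lines(["Summer Sale", "Up to 50% off"])` returns an empty list, while an 8-line input or a 17-character line each produces one reported problem.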

Layout

  • Only horizontal text boxes are supported (the number of vertical text boxes was insufficient, so we excluded them from the training data).
  • Text boxes must keep an aspect ratio proportional to content length for optimal results (this follows from the tight bounding box annotations used in training).
  • No automatic text wrapping within boxes (multi-line text was split into separate boxes during training).
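
One way to respect the layout limits above is to give every line its own horizontal box whose width scales with the number of characters. The sketch below is illustrative only: `TextBox`, `layout_lines`, and the `char_aspect` parameter (glyph width divided by glyph height) are hypothetical, and the default of 1.0 is an assumption that suits roughly square glyphs; narrower scripts would need a smaller value.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    text: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def layout_lines(text, x, y, line_height, char_aspect=1.0, line_gap=0.0):
    """Split text on newlines and place each non-empty line in its own horizontal box."""
    boxes = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines instead of creating empty boxes
        # Box width grows linearly with the character count (tight box assumption).
        width = len(line) * char_aspect * line_height
        top = y + len(boxes) * (line_height + line_gap)
        boxes.append(TextBox(line, x, top, width, line_height))
    return boxes
```

With `layout_lines("AB\nCDEF", x=0, y=0, line_height=10)`, the first box is 20 units wide and the second is 40 units wide, so each box keeps the same per-character aspect ratio.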

Prompt Behavior

  • Text content should not be specified in prompts (to match the training setting).
  • Limited precise control over text attributes. For poster generation, we expect the model to automatically determine text attributes like fonts and colors. Thus, descriptions about text attributes were intentionally suppressed in training captions.

Citation

If you find PosterMaker useful for your research and applications, please cite using this BibTeX:

@misc{gao2025postermakerhighqualityproductposter,
          title={PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering}, 
          author={Yifan Gao and Zihang Lin and Chuanbin Liu and Min Zhou and Tiezheng Ge and Bo Zheng and Hongtao Xie},
          year={2025},
          eprint={2504.06632},
          archivePrefix={arXiv},
          primaryClass={cs.CV},
          url={https://arxiv.org/abs/2504.06632},
}

LICENSE

The model is finetuned from SD3; therefore, it follows the original SD3 license.
