---
base_model:
- stabilityai/stable-diffusion-3.5-medium
tags:
- art
license: other
license_name: stabilityai-ai-community
license_link: LICENSE
---

# Bokeh 3.5 Medium

Bokeh 3.5 Medium is a **continued-training** model built on the **Stable Diffusion 3.5 Medium** foundation and further refined on a **5-million-image (500W) high-resolution open-source dataset** with rigorous **aesthetic curation**. The aim is high image quality, fine detail preservation, and improved controllability.

This model is released under the Stability Community License. For additional resources, visit [Tensor.Art](https://tensor.art) or [TusiArt](https://tusiart.com).

## Overview

- **Continued training on SD3.5M**, using a large-scale **5-million-image (500W) high-resolution dataset** curated for aesthetic quality.
- **Supports hybrid short/long caption training** for enhanced natural language understanding.
  - **Short Captions:** Focus on core image features.
  - **Long Captions:** Provide broader scene context and atmospheric details.
- **Recommended Resolutions:** `1920x1024`, `1728x1152`, `1152x1728`, `1280x1664`, `1440x1440`
- **Best Quality Training Resolution:** `1440x1440`
- **Supports LoRA fine-tuning.**

## Advantages

### 🖼️ High-Quality Image Generation

- **High visual fidelity** with improved detail extraction and **aesthetic consistency**.
- **Enhanced resolution support** up to about **2 million pixels (200W)** for highly detailed outputs.
- **Carefully curated dataset** for better composition, lighting, and overall artistic appeal.

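As a quick sanity check on the resolution figures, the short sketch below computes the pixel count of each recommended resolution from the Overview. The helper name `pixel_count` is illustrative only and not part of any library. Each recommended size should land near the roughly 2-million-pixel (200W) budget mentioned above.

```python
# Recommended resolutions copied from the Overview list.
RECOMMENDED = ["1920x1024", "1728x1152", "1152x1728", "1280x1664", "1440x1440"]


def pixel_count(resolution: str) -> int:
    """Return the total number of pixels for a string such as '1440x1440'."""
    first, second = (int(side) for side in resolution.lower().split("x"))
    return first * second


# Print each resolution with its size in megapixels.
for res in RECOMMENDED:
    print(f"{res}: {pixel_count(res) / 1e6:.2f} MP")
```

The same helper can be used to check whether a custom resolution stays within a similar pixel budget before generating.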
### 🎯 Powerful Custom Fine-Tuning

- **Exceptional LoRA training support**, making it highly effective for:
  - Photography
  - 3D Rendering
  - Illustration
  - Concept Art

### ⚡ Efficient Inference & Training

- **Low hardware requirements for inference:**
  - **Medium model:** 9GB VRAM (without T5)
  - **Full weights inference:** 16GB VRAM (suitable for local deployment)
- **LoRA fine-tuning VRAM requirement:** 12GB - 32GB

## Known Issues

- **Potential human anatomy inconsistencies.**
- **Limited ability to generate photorealistic images.**
- **Some concepts may suffer from aesthetic quality issues.**

## Prompting Guide

### Use a structured prompt combining:

- **Main subject** (e.g., `"Close-up of a macaw"`)
- **Detailed features** (e.g., `"vivid feathers, sharp beak"`)
- **Background environment** (e.g., `"dimly lit environment"`)
- **Atmospheric description** (e.g., `"soft warm lighting, cinematic mood"`)

### Best Practices:

- **Avoid overly complex prompts**, as the model already has strong text encoding. Overloading details can cause **T5 hallucination artifacts**, reducing image quality.
- **Do not use excessively short prompts** (e.g., single words or 2-3 tokens) unless combined with **LoRA or Image2Image (i2i)** techniques.
- **Avoid mixing too many unrelated concepts**, as this can lead to visual distortions and unwanted artifacts.
- **Optimal token length:** **30-70 tokens**.

### Negative Prompting

- **Negative prompts strongly influence image quality.**
- Ensure they **do not contradict the main subject** to avoid degrading the output.

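The prompting guidance above can be sketched as a small helper. This is a minimal illustration, not part of the model or of any library: the function names `build_prompt`, `word_count`, and `negative_conflicts` are hypothetical. The whitespace word count is only a crude proxy for length, since the actual token count depends on the tokenizer and is usually higher. The conflict check is a naive word-overlap heuristic.

```python
import re


def build_prompt(subject: str, features: str, environment: str, atmosphere: str) -> str:
    """Join the four structured parts into one comma-separated prompt."""
    parts = [subject, features, environment, atmosphere]
    return ", ".join(p.strip() for p in parts if p and p.strip())


def word_count(prompt: str) -> int:
    """Crude length proxy: whitespace-separated words, not real tokens."""
    return len(prompt.split())


def negative_conflicts(prompt: str, negative_prompt: str, min_len: int = 4) -> set:
    """Return words (of at least min_len characters) that appear in both prompts."""
    def words(text: str) -> set:
        return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= min_len}

    return words(prompt) & words(negative_prompt)


prompt = build_prompt(
    "Close-up of a macaw",
    "vivid feathers, sharp beak",
    "dimly lit environment",
    "soft warm lighting, cinematic mood",
)
print(prompt)
print("words:", word_count(prompt))
# A negative prompt that names the main subject is flagged as a conflict.
print("conflicts:", negative_conflicts(prompt, "blurry, low quality, macaw"))
```

A non-empty conflict set suggests the negative prompt may be working against the main subject and is worth revising before generation.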
## Example Output

Using diffusers:

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Local checkpoint path from the original example; point it at your own copy of the weights.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "/mnt/share/pcm_outputs/bokeh_3.5_medium",
    torch_dtype=torch.bfloat16,
)
pipe = pipe.to("cuda")

# Generate one image and save it to disk.
image = pipe(
    "Close-up of a macaw, dimly lit environment",
    num_inference_steps=28,
    guidance_scale=4,
    height=1920,
    width=1024,
).images[0]
image.save("macaw.jpg")
```

Using ComfyUI:

To use this workflow in **ComfyUI**, download the JSON file and load it: [Download Workflow](bk_workflow.json)

## Recommended Training Configuration

For **LoRA fine-tuning**, the following tools and settings are recommended:

### 🔧 Training Tools

- **Kohya_ss:** [GitHub Repository](https://github.com/bmaltais/kohya_ss.git)
- **Simple Tuner:** [GitHub Repository](https://github.com/bghira/SimpleTuner)

### ⚙️ Suggested Training Settings

```bash
--Resolution 1440x1440
--t5xxl_max_token_length 154
--optimizer_type AdamW8bit
--mmdit_lr 1e-4
--text_encoder_lr 5e-5
```

## Contact

* Website: https://tensor.art, https://tusiart.com
* Developed by: TensorArt