salso committed on
Commit 8000271 · verified · 1 Parent(s): 1a678c6

Update README.md

Files changed (1)
  1. README.md +10 -137
README.md CHANGED
@@ -1,137 +1,10 @@
- # Structure-Preserving Style Transfer using Canny, Depth & Flux
-
- ## Overview
-
- This project implements a custom **image-to-image style transfer pipeline** that blends the **style of one image (Image A)** into the **structure of another image (Image B)**.
-
- We simply added Canny edge conditioning to [this work by Nathan Shipley](https://gist.github.com/nathanshipley/7a9ac1901adde76feebe58d558026f68), in which the fusion of style and structure creates artistic visual outputs. It's an easy edit.
-
- We will release the code for the version that leverages the ZenCtrl architecture.
-
- ---
-
- ## Key Features
-
- - **Style-Structure Fusion**: Seamlessly transfers style from Image A into the spatial geometry of Image B.
- - **Model-Driven Pipeline**: No UI dependencies; powered entirely through locally executed Python scripts.
- - **Modular**: Easily plug in other models or replace components (ControlNet, encoders, etc.).
-
- ---
-
- ## How It Works
-
- 1. **Inputs**:
-    - **Image A**: Style reference
-    - **Image B**: Structural reference
-
- 2. **Structural Conditioning** (first sketch after this list):
-    - **Canny Edge Map** of Image B
-    - **Depth Map** via a pre-trained MiDaS model or similar
-
- 3. **Style Conditioning** (second sketch):
-    - Style prompts or embeddings extracted from Image A via a CLIP/T5/BLIP2 encoder
-
- 4. **Generation Phase** (third sketch):
-    - A diffusion model (e.g., Stable Diffusion + ControlNet) is used
-    - Flux-style injection merges the style and structure via guided conditioning
-    - The output image retains Image B's layout but adopts Image A's artistic features
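
The structural maps in step 2 are cheap to compute. Below is a minimal sketch, assuming OpenCV for the Canny map and the `Intel/dpt-large` checkpoint credited later in this README for depth; the function names are illustrative, not this repo's API:

```python
# Illustrative sketch of the structural-conditioning step (not the repo's code).
# Assumes: opencv-python, transformers, torch; model ID from the credits section.
import cv2
import numpy as np
import torch
from PIL import Image
from transformers import DPTForDepthEstimation, DPTImageProcessor

def extract_canny(image_path: str, low: int = 100, high: int = 200) -> Image.Image:
    """Canny edge map of the structure image (Image B)."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(gray, low, high)
    # ControlNet expects a 3-channel conditioning image.
    return Image.fromarray(np.stack([edges] * 3, axis=-1))

def estimate_depth(image_path: str) -> Image.Image:
    """Monocular depth map via a DPT (MiDaS-family) model."""
    processor = DPTImageProcessor.from_pretrained("Intel/dpt-large")
    model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        depth = model(**inputs).predicted_depth[0]
    # Normalize to 0-255 and resize back to Image B's resolution.
    depth = (depth - depth.min()) / (depth.max() - depth.min())
    gray8 = (depth.numpy() * 255).astype(np.uint8)
    return Image.fromarray(gray8).convert("RGB").resize(image.size)
```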
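For step 3, the style signal can be a plain text prompt or an embedding taken directly from Image A. A sketch of the embedding route with CLIP; the checkpoint `openai/clip-vit-base-patch32` is our assumption, and, as the list above notes, a T5 or BLIP2 encoder could stand in:

```python
# Illustrative sketch of the style-conditioning step (not the repo's code).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

def encode_style(image_path: str) -> torch.Tensor:
    """Return a normalized CLIP embedding summarizing Image A's style."""
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    inputs = processor(images=Image.open(image_path).convert("RGB"), return_tensors="pt")
    with torch.no_grad():
        features = model.get_image_features(**inputs)
    return features / features.norm(dim=-1, keepdim=True)
```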
-
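Step 4 ties the pieces together. One standard way to realize it is diffusers' multi-ControlNet Stable Diffusion pipeline, conditioned on both maps at once; the checkpoint IDs, conditioning scales, and step count below are illustrative assumptions, not the project's released settings:

```python
# Illustrative sketch of the generation step (not the repo's code).
# Assumes a CUDA GPU and the canny_image / depth_image from the sketches above.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnets = [
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16),
]
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnets, torch_dtype=torch.float16
).to("cuda")

result = pipe(
    prompt="a painting in van Gogh style",      # style via prompt (or embedding)
    image=[canny_image, depth_image],           # structure from Image B
    controlnet_conditioning_scale=[0.8, 0.6],   # balance edges vs. depth
    num_inference_steps=30,
).images[0]
result.save("assets/output.jpg")
```

Raising the Canny scale preserves layout more rigidly; lowering it gives the style more freedom.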
- ---
-
- ## Project Structure
-
- ```
- ├── models/
- │   ├── controlnet/
- │   └── stable-diffusion/
- ├── scripts/
- │   ├── extract_edges.py   # Canny edge map from Image B
- │   ├── estimate_depth.py  # Generate depth map
- │   ├── encode_style.py    # Encode Image A (prompt or features)
- │   └── generate.py        # Full generation pipeline
- ├── assets/
- │   ├── input_style.jpg
- │   ├── input_structure.jpg
- │   └── output.jpg
- └── README.md
- ```
-
- ---
-
- ## Quick Start
-
- 1. **Install dependencies**
-
- ```bash
- pip install -r requirements.txt
- ```
-
- 2. **Run generation**
-
- ```bash
- python scripts/generate.py \
-   --style_image assets/input_style.jpg \
-   --structure_image assets/input_structure.jpg \
-   --output_image assets/output.jpg
- ```
-
- Options (a sketch of how these flags might be parsed follows below):
- - `--use_depth true`
- - `--use_canny true`
- - `--style_prompt "a painting in van Gogh style"`
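
Since `scripts/generate.py` itself isn't shown in this commit, here is a plausible argparse front end matching the flags above; treat the wiring as a guess:

```python
# Hypothetical flag parsing for scripts/generate.py; names mirror the usage
# example above, but the actual script may differ.
import argparse

def str2bool(value: str) -> bool:
    return value.lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser(description="Structure-preserving style transfer")
parser.add_argument("--style_image", required=True)      # Image A: style reference
parser.add_argument("--structure_image", required=True)  # Image B: structural reference
parser.add_argument("--output_image", default="assets/output.jpg")
parser.add_argument("--use_depth", type=str2bool, default=True)  # enable depth ControlNet
parser.add_argument("--use_canny", type=str2bool, default=True)  # enable Canny ControlNet
parser.add_argument("--style_prompt", default=None)      # optional text override for style
args = parser.parse_args()
```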
-
- ---
-
- ## Example
-
- | Style (Image A) | Structure (Image B) | Output |
- |-----------------|---------------------|--------|
- | ![](assets/input_style.jpg) | ![](assets/input_structure.jpg) | ![](assets/output.jpg) |
-
- ---
-
- ## Use Cases
-
- - AI-powered visual storytelling
- - Concept art and virtual scene design
- - Artistic remapping of real-world photos
- - Ad creative generation
-
- ---
-
- ## Credits & Inspiration
-
- - [Nathan Shipley's work](https://gist.github.com/nathanshipley/7a9ac1901adde76feebe58d558026f68) for the idea spark
- - Hugging Face models:
-   - [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion-v1-4)
-   - [ControlNet for Canny/Depth](https://huggingface.co/lllyasviel/ControlNet)
-   - [MiDaS for Depth](https://huggingface.co/Intel/dpt-large)
- - Flux-based prompt handling inspired by multimodal conditioning techniques
-
- ---
-
- ## Try Also: ZenCtrl – Modular Agentic Image Control
-
- If you enjoyed this project, you may also like [**ZenCtrl**](https://github.com/FotographerAI/ZenCtrl), our open-source **agentic visual control toolkit** for generative image pipelines, which we are actively developing.
-
- ZenCtrl can be combined with this style transfer project to introduce additional layers of control, allowing for more refined composition before or after stylization. It's especially useful when working with structured scenes, human subjects, or product imagery.
-
- With ZenCtrl, we aim to:
- - Chain together preprocessing, control, editing, and postprocessing modules
- - Create workflows for tasks like **product photography**, **try-on**, **background swaps**, and **face editing**
- - Use control adapters like **canny, depth, pose, segmentation**, and more
- - Easily integrate with APIs or run it in a Hugging Face Space
-
- > Whether you're refining structure by changing the background layout before stylization or editing the results afterward, ZenCtrl gives you full compositional control across the image generation stack.
-
- 👉 [Explore ZenCtrl on GitHub](https://github.com/FotographerAI/ZenCtrl)
- 👉 [Try the ZenCtrl Demo on Hugging Face Spaces](https://huggingface.co/spaces/fotographerai/ZenCtrl)
-
- ---
-
- ## Contact
-
- Want to collaborate or learn more? Reach out via GitHub or drop us a message!
 
+ title: Zen Style Shape
+ emoji: ⚡
+ colorFrom: red
+ colorTo: pink
+ sdk: gradio
+ sdk_version: 5.24.0
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ short_description: Structure-Preserving Style Transfer with Canny, Depth & Flux
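
The new README is Hugging Face Spaces front matter: it tells the Space to launch `app.py` under the Gradio SDK (version 5.24.0). A purely illustrative `app.py` that would satisfy this config; the Space's real app is not part of this commit:

```python
# Minimal Gradio app consistent with the front matter above (illustrative only).
import gradio as gr

def stylize(style_image, structure_image):
    # The real Space would run the Canny/Depth/Flux pipeline here.
    return structure_image  # placeholder output

demo = gr.Interface(
    fn=stylize,
    inputs=[gr.Image(label="Style (Image A)"), gr.Image(label="Structure (Image B)")],
    outputs=gr.Image(label="Output"),
    title="Zen Style Shape",
)

if __name__ == "__main__":
    demo.launch()
```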