Update README.md
README.md
CHANGED
@@ -25,10 +25,18 @@ Keywords: Video Inpainting, Video Editing, Video Generation
|
|
25 |
|
26 |
|
27 |
<p align="center">
|
28 |
-
<a href='https://yxbian23.github.io/project/video-painter'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
|
29 |
</p>
|
30 |
|
31 |
-
**Your
|
32 |
|
33 |
|
34 |
**📖 Table of Contents**
|
@@ -36,10 +44,15 @@ Keywords: Video Inpainting, Video Editing, Video Generation
|
|
36 |
|
37 |
- [VideoPainter](#videopainter)
|
38 |
- [🔥 Update Log](#-update-log)
|
39 |
-
- [TODO](#todo)
|
40 |
- [🛠️ Method Overview](#️-method-overview)
|
41 |
- [🚀 Getting Started](#-getting-started)
|
42 |
- [🏃🏼 Running Scripts](#-running-scripts)
|
43 |
- [🤝🏼 Cite Us](#-cite-us)
|
44 |
- [💖 Acknowledgement](#-acknowledgement)
|
45 |
|
@@ -48,11 +61,13 @@ Keywords: Video Inpainting, Video Editing, Video Generation
|
|
48 |
## 🔥 Update Log
|
49 |
- [2025/3/09] 📢 📢 [VideoPainter](https://huggingface.co/TencentARC/VideoPainter), an efficient, any-length video inpainting & editing framework with plug-and-play context control, is released.
|
50 |
- [2025/3/09] 📢 📢 [VPData](https://huggingface.co/datasets/TencentARC/VPData) and [VPBench](https://huggingface.co/datasets/TencentARC/VPBench), the largest video inpainting dataset with precise segmentation masks and dense video captions (>390K clips), are released.
|
51 |
|
52 |
## TODO
|
53 |
|
54 |
- [x] Release training and inference code
|
55 |
-
- [x] Release
|
56 |
- [x] Release [VideoPainter checkpoints](https://huggingface.co/TencentARC/VideoPainter) (based on CogVideoX-5B)
|
57 |
- [x] Release [VPData and VPBench](https://huggingface.co/collections/TencentARC/videopainter-67cc49c6146a48a2ba93d159) for large-scale training and evaluation.
|
58 |
- [x] Release gradio demo
|
@@ -107,10 +122,7 @@ pip install -e .
|
|
107 |
</details>
|
108 |
|
109 |
<details>
|
110 |
-
<summary><b>
|
111 |
-
|
112 |
-
|
113 |
-
**VPBench and VPData**
|
114 |
|
115 |
You can download VPBench [here](https://huggingface.co/datasets/TencentARC/VPBench) and VPData [here](https://huggingface.co/datasets/TencentARC/VPData) (as well as the DAVIS data we re-processed), which are used for training and testing the BrushNet. By downloading the data, you agree to the terms and conditions of the license. The data structure should look like this:
|
116 |
|
@@ -172,11 +184,16 @@ You can download the VPData (only mask and text annotations due to the space lim
|
|
172 |
git lfs install
|
173 |
git clone https://huggingface.co/datasets/TencentARC/VPData
|
174 |
mv VPBench data
|
175 |
-
|
176 |
-
unzip
|
177 |
```
|
178 |
|
179 |
-
Noted: *Due to the space limit, you need to run the following script to download the raw videos of the
|
180 |
|
181 |
```
|
182 |
cd data_utils
|
@@ -216,6 +233,13 @@ git clone https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev
|
|
216 |
mv ckpt/FLUX.1-Fill-dev ckpt/flux_inp
|
217 |
```
|
218 |
|
219 |
|
220 |
The ckpt structure should be like:
|
221 |
|
@@ -237,6 +261,7 @@ The ckpt structure should be like:
|
|
237 |
|-- transformer
|
238 |
|-- vae
|
239 |
|-- ...
|
|
|
240 |
```
|
241 |
</details>
|
242 |
|
|
|
25 |
|
26 |
|
27 |
<p align="center">
|
28 |
+
<a href='https://yxbian23.github.io/project/video-painter'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
|
29 |
+
<a href="https://arxiv.org/abs/2503.05639"><img src="https://img.shields.io/badge/arXiv-2503.05639-b31b1b.svg"></a>
|
30 |
+
<a href="https://github.com/TencentARC/VideoPainter"><img src="https://img.shields.io/badge/GitHub-Code-black?logo=github"></a>
|
31 |
+
<a href="https://youtu.be/HYzNfsD3A0s"><img src="https://img.shields.io/badge/YouTube-Video-red?logo=youtube"></a>
|
32 |
+
<a href='https://huggingface.co/datasets/TencentARC/VPData'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-blue'></a>
|
33 |
+
<a href='https://huggingface.co/datasets/TencentARC/VPBench'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Benchmark-blue'></a>
|
34 |
+
<a href="https://huggingface.co/TencentARC/VideoPainter"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue"></a>
|
35 |
</p>
|
36 |
|
37 |
+
**Your star means a lot to us as we develop this project!** ⭐⭐⭐
|
38 |
+
|
39 |
+
**VPData and VPBench have been fully uploaded (containing 390K mask sequences and video captions). You are welcome to use VPData, our largest video segmentation dataset with video captions!** 🔥🔥🔥
|
40 |
|
41 |
|
42 |
**📖 Table of Contents**
|
|
|
44 |
|
45 |
- [VideoPainter](#videopainter)
|
46 |
- [🔥 Update Log](#-update-log)
|
47 |
+
- [📋 TODO](#todo)
|
48 |
- [🛠️ Method Overview](#️-method-overview)
|
49 |
- [🚀 Getting Started](#-getting-started)
|
50 |
+
- [Environment Requirement 🌍](#environment-requirement-)
|
51 |
+
- [Data Download ⬇️](#data-download-️)
|
52 |
- [🏃🏼 Running Scripts](#-running-scripts)
|
53 |
+
- [Training 🤯](#training-)
|
54 |
+
- [Inference 📜](#inference-)
|
55 |
+
- [Evaluation 📏](#evaluation-)
|
56 |
- [🤝🏼 Cite Us](#-cite-us)
|
57 |
- [💖 Acknowledgement](#-acknowledgement)
|
58 |
|
|
|
61 |
## 🔥 Update Log
|
62 |
- [2025/3/09] 📢 📢 [VideoPainter](https://huggingface.co/TencentARC/VideoPainter), an efficient, any-length video inpainting & editing framework with plug-and-play context control, is released.
|
63 |
- [2025/3/09] 📢 📢 [VPData](https://huggingface.co/datasets/TencentARC/VPData) and [VPBench](https://huggingface.co/datasets/TencentARC/VPBench), the largest video inpainting dataset with precise segmentation masks and dense video captions (>390K clips), are released.
|
64 |
+
- [2025/3/25] 📢 📢 The 390K+ high-quality video segmentation masks of [VPData](https://huggingface.co/datasets/TencentARC/VPData) have been fully released.
|
65 |
+
- [2025/3/25] 📢 📢 The raw videos of the Videovo subset have been uploaded to [VPData](https://huggingface.co/datasets/TencentARC/VPData) to resolve the raw video link expiration issue.
|
66 |
|
67 |
## TODO
|
68 |
|
69 |
- [x] Release training and inference code
|
70 |
+
- [x] Release evaluation code
|
71 |
- [x] Release [VideoPainter checkpoints](https://huggingface.co/TencentARC/VideoPainter) (based on CogVideoX-5B)
|
72 |
- [x] Release [VPData and VPBench](https://huggingface.co/collections/TencentARC/videopainter-67cc49c6146a48a2ba93d159) for large-scale training and evaluation.
|
73 |
- [x] Release gradio demo
|
|
|
122 |
</details>
|
123 |
|
124 |
<details>
|
125 |
+
<summary><b>VPBench and VPData Download ⬇️</b></summary>
|
126 |
|
127 |
You can download VPBench [here](https://huggingface.co/datasets/TencentARC/VPBench) and VPData [here](https://huggingface.co/datasets/TencentARC/VPData) (as well as the DAVIS data we re-processed), which are used for training and testing the BrushNet. By downloading the data, you agree to the terms and conditions of the license. The data structure should look like this:
|
128 |
|
|
|
184 |
git lfs install
|
185 |
git clone https://huggingface.co/datasets/TencentARC/VPData
|
186 |
mv VPBench data
|
187 |
+
|
188 |
+
# 1. unzip the masks in VPData
|
189 |
+
python data_utils/unzip_folder.py --source_dir ./data/videovo_masks --target_dir ./data/video_inpainting/videovo
|
190 |
+
python data_utils/unzip_folder.py --source_dir ./data/pexels_masks --target_dir ./data/video_inpainting/pexels
|
191 |
+
|
192 |
+
# 2. unzip the raw videos of the Videovo subset in VPData
|
193 |
+
python data_utils/unzip_folder.py --source_dir ./data/videovo_raw_videos --target_dir ./data/videovo/raw_video
|
194 |
```
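
The `data_utils/unzip_folder.py` script itself is not shown in this diff. As a rough sketch of what such a step might do, assuming it simply extracts every `.zip` archive found in `--source_dir` into `--target_dir` (the function name `unzip_folder` and this behavior are assumptions, not the script's confirmed implementation):

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def unzip_folder(source_dir: str, target_dir: str) -> list[str]:
    """Extract every .zip archive in source_dir into target_dir.

    Hypothetical stand-in for data_utils/unzip_folder.py; the real script
    may lay out its output differently.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    # Sort so archives are processed in a stable, reproducible order.
    for archive in sorted(Path(source_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(archive.name)
    return extracted
```

Under these assumptions, `unzip_folder("./data/videovo_masks", "./data/video_inpainting/videovo")` would mirror the first command above.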
|
195 |
|
196 |
+
Note: *Due to space limits, you need to run the following script to download the raw videos of the Pexels subset of VPData. The format should be consistent with VPData/VPBench above (after you download VPData/VPBench, the script automatically places the raw videos of VPData into the corresponding dataset directories created by VPBench).*
|
197 |
|
198 |
```
|
199 |
cd data_utils
|
|
|
233 |
mv ckpt/FLUX.1-Fill-dev ckpt/flux_inp
|
234 |
```
|
235 |
|
236 |
+
[Optional] To use video segmentation in the Gradio demo, download [SAM2](https://huggingface.co/facebook/sam2-hiera-large):
|
237 |
+
```
|
238 |
+
git lfs install
|
239 |
+
cd ckpt
|
240 |
+
wget https://huggingface.co/facebook/sam2-hiera-large/resolve/main/sam2_hiera_large.pt
|
241 |
+
```
|
242 |
+
You can also choose a segmentation checkpoint of a different size, such as [SAM2-Tiny](https://huggingface.co/facebook/sam2-hiera-tiny), to balance efficiency and performance.
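
To make the choice of checkpoint size concrete, here is a small sketch that builds the download URL for a given SAM2 size. Only the `large` URL appears in this README; the helper name `sam2_checkpoint_url` is hypothetical, and the naming pattern for other sizes is an assumption that should be checked against the corresponding Hugging Face repository.

```python
def sam2_checkpoint_url(size: str = "large") -> str:
    """Build a Hugging Face download URL for a SAM2 Hiera checkpoint.

    Assumes every size follows the same repo/file pattern as the "large"
    checkpoint used above, with hyphens in the repo name becoming
    underscores in the file name (e.g. "base-plus" -> "base_plus").
    """
    file_size = size.replace("-", "_")
    return (
        f"https://huggingface.co/facebook/sam2-hiera-{size}"
        f"/resolve/main/sam2_hiera_{file_size}.pt"
    )


if __name__ == "__main__":
    # Print the URL that could be passed to wget for the tiny checkpoint.
    print(sam2_checkpoint_url("tiny"))
```

If a different size is used, the checkpoint file name in `ckpt` changes accordingly, so any path that refers to `sam2_hiera_large.pt` would need to be adjusted.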
|
243 |
|
244 |
The ckpt structure should be like:
|
245 |
|
|
|
261 |
|-- transformer
|
262 |
|-- vae
|
263 |
|-- ...
|
264 |
+
|-- sam2_hiera_large.pt
|
265 |
```
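
Before running the scripts, it may help to confirm that the expected entries are present. The sketch below checks only the entries visible in this diff (`flux_inp` from the `mv` command and the optional `sam2_hiera_large.pt`); the helper name `missing_checkpoints` is hypothetical, and the assumption that both sit directly under `ckpt` is inferred from the commands above rather than from the full tree.

```python
from __future__ import annotations

from pathlib import Path

# Entries taken from the commands shown in this README. Their placement
# directly under ckpt/ is an assumption, since the tree is only partly shown.
EXPECTED_ENTRIES = ["flux_inp", "sam2_hiera_large.pt"]


def missing_checkpoints(ckpt_dir: str = "ckpt",
                        expected: list[str] = EXPECTED_ENTRIES) -> list[str]:
    """Return the expected entries that do not exist under ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing from ckpt/:", ", ".join(missing))
    else:
        print("All listed checkpoint entries found.")
```

Since SAM2 is optional, a missing `sam2_hiera_large.pt` only matters when the Gradio demo's video segmentation is needed.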
|
266 |
</details>
|
267 |
|