|
---
license: mit
base_model:
- Shitao/OmniGen-v1
pipeline_tag: text-to-image
tags:
- image-to-image
---
|
|
|
|
|
This repo contains bitsandbytes 4-bit NF4, float16 model weights for [OmniGen-v1](https://huggingface.co/Shitao/OmniGen-v1). They are intended for Google Colab users and anyone whose GPU does not support bfloat16. If your GPU does support bfloat16, use the [bf16-bnb-4bit](https://huggingface.co/gryan/OmniGen-v1-bnb-4bit) weights instead, as they produce higher-quality images. For more about OmniGen, see the [original model card](https://huggingface.co/Shitao/OmniGen-v1).
|
|
|
|
|
Other quantized variants:

- 8-bit weights: [gryan/OmniGen-v1-bnb-8bit](https://huggingface.co/gryan/OmniGen-v1-bnb-8bit)
- 4-bit (bf16, nf4) weights: [gryan/OmniGen-v1-bnb-4bit](https://huggingface.co/gryan/OmniGen-v1-bnb-4bit)
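
If you're unsure which variant your GPU can use, you can check bfloat16 support directly in PyTorch (`torch.cuda.is_bf16_supported()` is a standard PyTorch API; Colab's free-tier T4, for example, reports `False`, which is why this fp16 repo exists):

```python
import torch

# True on Ampere (A100, RTX 30xx) and newer GPUs; False on older cards like Colab's T4.
# False -> use this fp16 repo; True -> prefer the bf16-bnb-4bit weights above.
print(torch.cuda.is_bf16_supported())
```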
|
|
|
|
|
## Usage |
|
Set up your environment by following the original [Quick Start Guide](https://huggingface.co/Shitao/OmniGen-v1#5-quick-start) before getting started. |
|
|
|
> [!IMPORTANT]
> This feature is not officially supported yet; you'll need to install OmniGen from [this pull request](https://github.com/VectorSpaceLab/OmniGen/pull/151).
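
One common way to install directly from an unmerged pull request is to point `pip` at GitHub's pull-request ref. This is a sketch, assuming PR #151 is still open; check the PR itself for the maintainers' preferred instructions:

```bash
# GitHub exposes every pull request as refs/pull/<number>/head
pip install git+https://github.com/VectorSpaceLab/OmniGen.git@refs/pull/151/head
```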
|
|
|
```python
import torch
from OmniGen import OmniGenPipeline, OmniGen

# load the quantized model and pass it into the pipeline
model = OmniGen.from_pretrained('gryan/OmniGen-v1-fp16-bnb-4bit', dtype=torch.float16)
pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1", model=model)

# proceed as normal!

## Text to Image
images = pipe(
    prompt="A curly-haired man in a red shirt is drinking tea.",
    height=1024,
    width=1024,
    guidance_scale=2.5,
    seed=0,
)
images[0].save("example_t2i.png")  # save output PIL Image

## Multi-modal to Image
# In the prompt, use a placeholder of the form <img><|image_*|></img> to represent each input image.
# You can pass multiple images via input_images; make sure each image has its own placeholder.
# For example, for input_images [img1_path, img2_path], the prompt needs two placeholders:
# <img><|image_1|></img> and <img><|image_2|></img>.
images = pipe(
    prompt="A man in a black shirt is reading a book. The man is the right man in <img><|image_1|></img>.",
    input_images=["./imgs/test_cases/two_man.jpg"],
    height=1024,
    width=1024,
    guidance_scale=2.5,
    img_guidance_scale=1.6,
    seed=0,
)
images[0].save("example_ti2i.png")  # save output PIL Image
```
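
To sanity-check that the 4-bit weights actually loaded (rather than silently falling back to full precision), PyTorch's memory counters give a quick signal. This is a minimal sketch using standard `torch.cuda` calls; exact numbers will vary by GPU and pipeline overhead:

```python
import torch

# A 4-bit quantized OmniGen should hold far less GPU memory than the full fp16 model.
print(f"allocated: {torch.cuda.memory_allocated() / 1024**3:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1024**3:.2f} GiB")
```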
|
|
|
## Image Samples |
|
<img src="./assets/text_only_1111_fp16_4bit.png" alt="Text Only FP16 4bit"> |
|
<img src="./assets/single_img_1111_fp16_4bit.png" alt="Single Image FP16 4bit"> |
|
<img src="./assets/double_img_1111_fp16_4bit.png" alt="Double Image FP16 4bit"> |