Can we use an existing Llama 3.1 GGUF?
Is there a GGUF loader for Llama 3.1 in ComfyUI yet?
Hey there, I found that this node pack includes a GGUF Quadruple Clip Loader node: https://github.com/calcuis/gguf
Yes, added yesterday to the main node pack already: https://github.com/city96/ComfyUI-GGUF/commit/47bec6147569a138dd30ad3e14f190a36a3be456
Copying from GitHub, these models should work:
- clip_g_hidream.safetensors from Comfy-Org/HiDream-I1_ComfyUI
- clip_l_hidream.safetensors from Comfy-Org/HiDream-I1_ComfyUI
- Any T5 GGUF from city96/t5-v1_1-xxl-encoder-gguf
- Any LLaMA 8B GGUF from bartowski/Meta-Llama-3.1-8B-Instruct-GGUF (I think this is the right one?)
Anyone have a demo workflow that uses GGUFs? Not sure why I'm having so much trouble!
Is there a quantized version of the VAE somewhere? Or does it degrade badly to use a quantized VAE?
The VAE is only 330 MB; there's no point in ever quantizing it, since that won't reduce the runtime memory required for decoding (the decoding activations take far more memory than the actual weights).
Seems to be working for me; I had to update my ComfyUI/custom_nodes/ComfyUI-GGUF to pick up the new GGUF Quadruple Clip Loader.
Seems to be working with mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF Q6_K, as well as city96's other GGUFs linked above. EDIT: Also tested working with DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF.
Still experimenting to find the best settings and how each part affects output, but here is the basic workflow from https://docs.comfy.org/tutorials/advanced/hidream modified to use the new GGUF loaders (hopefully it preserves the workflow metadata).
Basically just replace two nodes:
- Load Diffusion Model -> Unet Loader (GGUF)
- QuadrupleCLIPLoader -> QuadrupleCLIPLoader (GGUF)
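If you export the workflow in API format, the same two-node swap can also be scripted. A minimal sketch, assuming the `UnetLoaderGGUF` and `QuadrupleCLIPLoaderGGUF` class names from the ComfyUI-GGUF node pack (check your own export for the exact `class_type` strings):

```python
# Swap stock loader nodes for their GGUF equivalents in an
# API-format ComfyUI workflow: {node_id: {"class_type": ..., "inputs": ...}}.
# The GGUF class names below are assumptions based on the ComfyUI-GGUF pack.
SWAP = {
    "UNETLoader": "UnetLoaderGGUF",
    "QuadrupleCLIPLoader": "QuadrupleCLIPLoaderGGUF",
}

def swap_loaders(workflow: dict) -> dict:
    """Rewrite class_type for any node that has a GGUF replacement."""
    for node in workflow.values():
        if node.get("class_type") in SWAP:
            node["class_type"] = SWAP[node["class_type"]]
    return workflow

# Example: a stripped-down two-node workflow fragment.
wf = {
    "1": {"class_type": "UNETLoader",
          "inputs": {"unet_name": "hidream.safetensors"}},
    "2": {"class_type": "QuadrupleCLIPLoader",
          "inputs": {}},
}
swapped = swap_loaders(wf)
print(swapped["1"]["class_type"])  # UnetLoaderGGUF
```

Note you'd still need to point the `inputs` filenames at your actual .gguf files; this only handles the node-type swap.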
Good luck!