Further Improvements
Loving the LoRA gallery!
Just wanted to give you a heads-up in case you're interested: I just released a new Space that shows how you can improve the inference times even further :)
Thanks!
I might try out your quantization later tonight, when I have more time (and once my ZeroGPU quota restarts).
From your own experience/testing of the accelerated merge so far, have the bnb NF4 edit outputs (and/or compatibility with additional LoRAs) been reasonably consistent with the bf16 ones?
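For context, here's roughly how I'm planning to load the NF4 transformer on my end. This is a minimal, untested sketch: it assumes a recent diffusers with FluxKontextPipeline and BitsAndBytesConfig support, bitsandbytes installed, and that the quantization path works for Kontext the same way it does for regular Flux.

```python
import torch
from diffusers import BitsAndBytesConfig, FluxKontextPipeline, FluxTransformer2DModel

model_id = "black-forest-labs/FLUX.1-Kontext-dev"

# NF4 quantization for the transformer weights only
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for a fair comparison against the bf16 outputs
)

transformer = FluxTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)

pipe = FluxKontextPipeline.from_pretrained(
    model_id,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # helps stay within the ZeroGPU memory budget
```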
When I get around to testing the quant for this gallery, I might also try routing in one of zer0int's fine-tuned LongCLIPs as one of the text encoders, as I already have in some of my other Flux LoRA galleries. (Thus far, for regular t2i Flux/fine-tunes, I've had the best results with zer0int/LongCLIP-GmP-ViT-L-14. This time, though, I might try out the newer zer0int/LongCLIP-Registers-Gated_MLP-ViT-L-14.)
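Roughly, the swap I have in mind looks like this. Again an untested sketch: it assumes the zer0int checkpoint loads as a standard CLIPTextModel once the config reflects LongCLIP's 248-token context, and that the repo ships a usable CLIP tokenizer (otherwise the stock CLIP-L tokenizer with model_max_length=248 should do).

```python
import torch
from diffusers import FluxKontextPipeline
from transformers import CLIPTextConfig, CLIPTextModel, CLIPTokenizer

longclip_id = "zer0int/LongCLIP-GmP-ViT-L-14"

# Pull just the text-tower config and widen the positional embeddings
text_config = CLIPTextConfig.from_pretrained(longclip_id)
text_config.max_position_embeddings = 248

text_encoder = CLIPTextModel.from_pretrained(
    longclip_id, config=text_config, torch_dtype=torch.bfloat16
)
tokenizer = CLIPTokenizer.from_pretrained(longclip_id)
tokenizer.model_max_length = 248

# Route LongCLIP in as the CLIP-L encoder; text_encoder_2 (T5) stays stock
pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev",
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
)
```

One caveat as I understand it: Flux only consumes the pooled CLIP embedding (T5 handles the full prompt sequence), so the fine-tune's quality probably matters more here than the longer context itself.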
I must admit I can't yet speak from much personal experience with the Kontext models, but if they behave like the other FLUX models, the NF4 versions shouldn't show too much of a discernible difference. That said, I don't have the data or experience to take a confident position yet, as I'm only just beginning to experiment with Kontext myself.
Regarding the LongCLIP encoders, I'm curious too, so I'll definitely be keeping an eye out! My only personal experience with them is in the context of ComfyUI, never in an HF / diffusers-style implementation, so I'll be looking forward to seeing yours.
Warm regards