Can you roll out a 3.0bpw quantization model?

by xldistance

My video card only has 48 GB of VRAM.

I'm not sure a 3.0bpw quant of this will even fit in 48 GB of VRAM, but there's gghfez/c4ai-command-a-03-2025-exl2-3bpw.
You'll probably have to use llama.cpp and offload some layers to the CPU instead.
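If you go the llama.cpp route, partial offload looks roughly like this with llama-cpp-python. This is a minimal sketch, assuming a CUDA-enabled build and a GGUF quant of the model on disk; the filename and layer count below are hypothetical, so tune `n_gpu_layers` down until the model fits in 48 GB:

```python
# Minimal sketch of partial GPU offload via llama-cpp-python.
# Assumes a CUDA-enabled install and a GGUF quant on disk;
# the model path and layer count here are hypothetical examples.
from llama_cpp import Llama

llm = Llama(
    model_path="c4ai-command-a-03-2025-Q3_K_M.gguf",  # hypothetical filename
    n_gpu_layers=40,  # layers kept on the GPU; lower this if you run out of VRAM
    n_ctx=8192,       # context window; longer contexts also cost VRAM
)

out = llm("Write a haiku about quantization.", max_tokens=64)
print(out["choices"][0]["text"])
```

The layers not offloaded to the GPU run on the CPU, so generation gets slower, but the model fits.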
