ByteDance-Seed/Seed-OSS-36B-Instruct
SeedOssForCausalLM
is unfortunately not currently supported by llama.cpp. Please follow https://github.com/ggml-org/llama.cpp/issues/15483 and let us know once it is supported.
By the way, this is supported now (issue is closed).
@mradermacher
Please update to the latest version of llama.cpp in our fork and then, on nico1, remove the override for Seed-OSS-36B-Instruct. I already provided the GGUF.
✅
llama_model_quantize: failed to quantize: unknown model architecture: 'seed_oss'
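The architecture table is compiled into the llama.cpp tools, so an older llama-quantize binary will reject 'seed_oss' even though the GGUF itself converted fine. A minimal rebuild sketch (directory layout and build flags are assumptions, not the exact setup used here):

```shell
# Rebuild llama.cpp from the fork's current HEAD so the tools
# recognize the 'seed_oss' architecture added for this model.
cd llama.cpp
git pull
cmake -B build -DGGML_CUDA=ON          # CUDA backend only matters for imatrix/inference
cmake --build build --config Release -j"$(nproc)"
```

After rebuilding, rerunning the same quantize command should get past the "unknown model architecture" error.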
something went wrong with the cuda build, i'll investigate
indeed, the cuda build fails, not sure why. maybe a bug in 12.6
/usr/include/x86_64-linux-gnu/bits/mathcalls.h(79): error: exception specification is incompatible with that of previous function "cospi" (declared at line 2595 of /usr/local/cuda-12.6/bin/../targets/x86_64-linux/include/crt/math_functions.h)
extern double cospi (double __x) noexcept (true); extern double __cospi (double __x) noexcept (true);
yup, seems cuda 13 (or at least something newer than 12.6) is required. sigh, i have no time for this.
I was able to convert it to GGUF and quantize it on a CUDA 12.6 machine (for the base noSyn model).
quantize does not need or use cuda afaik; the problem is that we also run llama-imatrix, where we rely on cuda.
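The distinction above can be sketched as two commands; the file names are placeholders, not the actual artifacts from this thread:

```shell
# Quantization is pure CPU work: it rewrites tensors in the GGUF,
# so a CUDA-less llama.cpp build is sufficient for this step.
./build/bin/llama-quantize Seed-OSS-36B-Instruct.f16.gguf \
    Seed-OSS-36B-Instruct.Q4_K_M.gguf Q4_K_M

# Computing an importance matrix runs real inference over calibration
# text, which is where GPU offload (-ngl) and hence a working CUDA
# build actually matter.
./build/bin/llama-imatrix -m Seed-OSS-36B-Instruct.f16.gguf \
    -f calibration.txt -o imatrix.gguf -ngl 99
```

So a broken CUDA build blocks the imatrix step even though quantization itself succeeds on CPU, which matches what both sides of this exchange observed.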