
EXL3

Custom EXL3 quantization: 5 bpw for the attention layers, 4 bpw for the MLP layers, and 8 bpw for the lm_head.

Fits roughly 64K of context in 24 GB of VRAM comfortably.
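As a rough guide, the sketch below loads the quant with a ~64K-token cache via the exllamav3 Python API. This is a minimal sketch, not taken from this card: the class and method names (Config.from_directory, Model.from_config, Cache, Generator) and the cache argument are assumptions about that library, so check the exllamav3 examples if they have changed.

```python
# Minimal sketch (assumed exllamav3 API, not from the original card).
from exllamav3 import Config, Model, Cache, Tokenizer, Generator

model_dir = "/path/to/this-exl3-quant"       # hypothetical local download path

config = Config.from_directory(model_dir)    # read quantized weights + config
model = Model.from_config(config)
cache = Cache(model, max_num_tokens=65536)   # ~64K context, matching the note above
model.load()                                 # should fit on a single 24 GB GPU

tokenizer = Tokenizer.from_config(config)
generator = Generator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="Hello", max_new_tokens=64))
```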

Original Model Card:

Prompt format

<|im_start|>system
You are a helpful assistant for generating thinking process.
You are generating thinking process based on the question and existing answer.<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
{answer}

<think>
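For reference, a minimal Python sketch (not part of the original card) that fills this template with plain string formatting; build_prompt and the sample strings are illustrative only. Generation then continues from the trailing <think> tag.

```python
# Minimal sketch: assemble the ChatML-style prompt above with str.format.
PROMPT_TEMPLATE = (
    "<|im_start|>system\n"
    "You are a helpful assistant for generating thinking process.\n"
    "You are generating thinking process based on the question and existing answer.<|im_end|>\n"
    "<|im_start|>user\n"
    "{question}<|im_end|>\n"
    "<|im_start|>assistant\n"
    "{answer}\n"
    "\n"
    "<think>"
)

def build_prompt(question: str, answer: str) -> str:
    """Fill in the template; the model writes its thinking process after <think>."""
    return PROMPT_TEMPLATE.format(question=question, answer=answer)

if __name__ == "__main__":
    print(build_prompt("What is 2 + 2?", "4"))
```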