A.X 4.0 Light

🤗 Models | 💬 Chat | 📬 APIs (FREE!) | 🖥️ GitHub

A.X 4.0 Family Highlights

SK Telecom released A.X 4.0 (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment, on July 3, 2025. Built on the open-source Qwen2.5 model and further trained on large-scale Korean datasets, A.X 4.0 delivers outstanding performance in real-world business environments.

  • Superior Korean Proficiency: Achieved a score of 78.3 on KMMLU, the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming GPT-4o (72.5).
  • Deep Cultural Understanding: Scored 83.5 on CLIcK, a benchmark for Korean cultural and contextual comprehension, surpassing GPT-4o (80.2).
  • Efficient Token Usage: A.X 4.0 uses approximately 33% fewer tokens than GPT-4o for the same Korean input, enabling more cost-effective and efficient processing (see the tokenizer sketch after this list).
  • Deployment Flexibility: Offered in both a 72B-parameter standard model (A.X 4.0) and a 7B lightweight version (A.X 4.0 Light).
  • Long Context Handling: Supports up to 131,072 tokens, allowing comprehension of lengthy documents and conversations (the lightweight model supports up to 16,384 tokens).
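
The token-efficiency claim is easy to sanity-check yourself. Below is a minimal sketch, not part of the official card, that counts tokens for the same Korean sentence with both tokenizers; it assumes tiktoken is installed for GPT-4o's o200k_base encoding.

import tiktoken
from transformers import AutoTokenizer

text = "1961년 4월 12일, 최초의 인간이 우주로 나가 지구를 공전했습니다."

# A.X 4.0's tokenizer, straight from the Hugging Face Hub.
ax_tokenizer = AutoTokenizer.from_pretrained("skt/A.X-4.0-Light")
# GPT-4o's public tokenizer (o200k_base) via tiktoken.
gpt4o_encoding = tiktoken.encoding_for_model("gpt-4o")

# Fewer tokens for the same Korean text means lower serving cost and
# more effective context per request.
print("A.X 4.0 Light tokens:", len(ax_tokenizer.encode(text)))
print("GPT-4o tokens       :", len(gpt4o_encoding.encode(text)))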

Performance

Model Performance

| Category | Benchmark | A.X 4.0 | Qwen3-235B-A22B (w/o reasoning) | Qwen2.5-72B | GPT-4o |
|---|---|---|---|---|---|
| Knowledge | KMMLU | 78.32 | 73.64 | 66.44 | 72.51 |
| | CLIcK | 83.51 | 74.55 | 72.59 | 80.22 |
| | KoBALT | 47.30 | 41.57 | 37.00 | 44.00 |
| | MMLU | 86.62 | 87.37 | 85.70 | 88.70 |
| General | Ko-MT-Bench | 86.69 | 88.00 | 82.69 | 88.44 |
| | MT-Bench | 83.25 | 86.56 | 93.50 | 88.19 |
| | LiveBench (2024.11) | 52.30 | 64.50 | 54.20 | 52.19 |
| Instruction Following | Ko-IFEval | 77.96 | 77.53 | 77.07 | 75.38 |
| | IFEval | 86.05 | 85.77 | 86.54 | 83.86 |
| Math | HRM8K | 48.55 | 54.52 | 46.37 | 43.27 |
| | MATH | 74.28 | 72.72 | 77.00 | 72.38 |
| Code | HumanEval+ | 79.27 | 79.27 | 81.71 | 86.00 |
| | MBPP+ | 73.28 | 70.11 | 75.66 | 75.10 |
| | LiveCodeBench (2024.10–2025.04) | 26.07 | 33.09 | 27.58 | 29.30 |
| Long Context | LongBench (<128K) | 56.70 | 49.40 | 45.60 | 47.50 |
| Tool-use | FunctionChatBench | 85.96 | 82.43 | 88.30 | 95.70 |

Lightweight Model Performance

| Category | Benchmark | A.X 4.0 Light | Qwen3-8B (w/o reasoning) | Qwen2.5-7B | EXAONE-3.5-7.8B | Kanana-1.5-8B |
|---|---|---|---|---|---|---|
| Knowledge | KMMLU | 64.15 | 63.53 | 49.56 | 53.76 | 48.28 |
| | CLIcK | 68.05 | 62.71 | 60.56 | 64.30 | 61.30 |
| | KoBALT | 30.29 | 26.57 | 21.57 | 21.71 | 23.14 |
| | MMLU | 75.43 | 82.89 | 75.40 | 72.20 | 68.82 |
| General | Ko-MT-Bench | 79.50 | 64.06 | 61.31 | 81.06 | 76.30 |
| | MT-Bench | 81.56 | 65.69 | 79.37 | 83.50 | 77.60 |
| | LiveBench | 37.10 | 50.20 | 37.00 | 40.20 | 29.40 |
| Instruction Following | Ko-IFEval | 72.99 | 73.39 | 60.73 | 65.01 | 69.96 |
| | IFEval | 84.68 | 85.38 | 76.73 | 82.61 | 80.11 |
| Math | HRM8K | 40.12 | 52.50 | 35.13 | 31.88 | 30.87 |
| | MATH | 68.88 | 71.48 | 65.58 | 63.20 | 59.28 |
| Code | HumanEval+ | 75.61 | 77.44 | 74.39 | 76.83 | 76.83 |
| | MBPP+ | 67.20 | 62.17 | 68.50 | 64.29 | 67.99 |
| | LiveCodeBench | 18.03 | 23.93 | 16.62 | 17.98 | 16.52 |

🚀 Quickstart

with HuggingFace Transformers

  • transformers>=4.46.0 is required to use skt/A.X-4.0-Light
pip install "transformers>=4.46.0"

Example Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skt/A.X-4.0-Light"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "system", "content": "๋‹น์‹ ์€ ์‚ฌ์šฉ์ž๊ฐ€ ์ œ๊ณตํ•˜๋Š” ์˜์–ด ๋ฌธ์žฅ๋“ค์„ ํ•œ๊ตญ์–ด๋กœ ๋ฒˆ์—ญํ•˜๋Š” AI ์ „๋ฌธ๊ฐ€์ž…๋‹ˆ๋‹ค."},
    {"role": "user", "content": "The first human went into space and orbited the Earth on April 12, 1961."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,
    )

len_input_prompt = len(input_ids[0])
response = tokenizer.decode(output[0][len_input_prompt:], skip_special_tokens=True)
print(response)
# Output:
# 1961년 4월 12일, 최초의 인간이 우주로 나가 지구를 공전했습니다.
# ("On April 12, 1961, the first human went into space and orbited the Earth.")

with vLLM

  • vllm>=0.6.4.post1 is required to use the tool-use function
pip install "vllm>=0.6.4.post1"
# if you don't want to enable the tool-use function, just comment out the vLLM options below
VLLM_OPTION="--enable-auto-tool-choice --tool-call-parser hermes"
vllm serve skt/A.X-4.0-Light $VLLM_OPTION

Example Usage

from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-4.0-Light"
messages = [{"role": "user", "content": "์—์–ด์ปจ ์—ฌ๋ฆ„์ฒ  ์ ์ • ์˜จ๋„๋Š”? ํ•œ์ค„๋กœ ๋‹ต๋ณ€ํ•ด์ค˜"}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='여름철 적정 에어컨 온도는 일반적으로 24-26도입니다.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)

messages = [{"role": "user", "content": "What is the appropriate temperature for air conditioning in summer? Response in a single sentence."}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='The appropriate temperature for air conditioning in summer generally ranges from 72ยฐF to 78ยฐF (22ยฐC to 26ยฐC) for comfort and energy efficiency.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
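
Once the server is running, a quick sanity check, reusing the client from above, confirms which model is being served (this snippet is illustrative and assumes the default port 8000):

# List the models the vLLM server currently exposes.
print([m.id for m in client.models.list().data])
# e.g. ['skt/A.X-4.0-Light']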

Examples for tool-use

from openai import OpenAI


def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools
    )
    print(completion.choices[0].message)


client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-4.0-Light"

calculate_discount = {
    "type": "function",
    "function": {
        "name": "calculate_discount",
        "description": "์›๊ฐ€๊ฒฉ๊ณผ ํ• ์ธ์œจ(ํผ์„ผํŠธ ๋‹จ์œ„)์„ ์ž…๋ ฅ๋ฐ›์•„ ํ• ์ธ๋œ ๊ฐ€๊ฒฉ์„๊ณ„์‚ฐํ•œ๋‹ค.",
        "parameters": {
            "type": "object",
            "properties": {
                "original_price": {
                    "type": "number",
                    "description": "์ƒํ’ˆ์˜ ์›๋ž˜ ๊ฐ€๊ฒฉ"
                },
                "discount_percentage": {
                    "type": "number",
                    "description": "์ ์šฉํ•  ํ• ์ธ์œจ(์˜ˆ: 20% ํ• ์ธ์˜ ๊ฒฝ์šฐ 20์„ ์ž…๋ ฅ)"
                }
            },
            "required": ["original_price", "discount_percentage"]
        }
    }
}
get_exchange_rate = {
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        "description": "๋‘ ํ†ตํ™” ๊ฐ„์˜ ํ™˜์œจ์„ ๊ฐ€์ ธ์˜จ๋‹ค.",
        "parameters": {
            "type": "object",
            "properties": {
                "base_currency": {
                    "type": "string",
                    "description": "The currency to convert from."
                },
                "target_currency": {
                    "type": "string",
                    "description": "The currency to convert to."
                }
            },
            "required": ["base_currency", "target_currency"]
        }
    }
}
tools = [calculate_discount, get_exchange_rate]

### Slot filling ###
messages = [{"role": "user", "content": "์šฐ๋ฆฌ๊ฐ€ ๋ญ˜ ์‚ฌ์•ผ๋˜๋Š”๋ฐ ์›๋ž˜ 57600์›์ธ๋ฐ ์ง์›ํ• ์ธ ๋ฐ›์„ ์ˆ˜ ์žˆ๊ฑฐ๋“ ? ํ• ์ธ๊ฐ€์ข€ ๊ณ„์‚ฐํ•ด์ค˜"}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='할인율을 알려주시겠습니까?', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
# (The model asks for the missing slot: "Could you tell me the discount rate?")


### Function calling ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"},
    {"role": "assistant", "content": "할인율을 알려주시겠습니까?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},  # "I can get a 15% discount."
]
call(messages, model)
# Output: 
# ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-7778d1d9fca94bf2acbb44c79359502c', function=Function(arguments='{"original_price": 57600, "discount_percentage": 15}', name='calculate_discount'), type='function')], reasoning_content=None)


### Completion ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"},
    {"role": "assistant", "content": "할인율을 알려주시겠습니까?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
    {"role": "tool", "tool_call_id": "random_id", "name": "calculate_discount", "content": "{\"original_price\": 57600, \"discount_percentage\": 15, \"discounted_price\": 48960.0}"}
]
call(messages, model)
# Output: 
# ChatCompletionMessage(content='57600원의 상품에서 15% 할인을 적용하면, 할인된 가격은 48960원입니다.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
# ("Applying a 15% discount to the 57,600 KRW item gives a discounted price of 48,960 KRW.")
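
In a real application you would execute the returned tool call locally and append its result, rather than hand-writing the tool message as above. A minimal sketch of that round trip, continuing from the examples above (client, model, tools, and the call helper are assumed to still be in scope; the local calculate_discount implementation below is hypothetical and not part of the model card):

import json

def calculate_discount(original_price: float, discount_percentage: float) -> dict:
    # Hypothetical local implementation of the tool declared above.
    discounted = original_price * (1 - discount_percentage / 100)
    return {
        "original_price": original_price,
        "discount_percentage": discount_percentage,
        "discounted_price": discounted,
    }

# Same conversation as the "Function calling" example above.
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"},
    {"role": "assistant", "content": "할인율을 알려주시겠습니까?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
]
completion = client.chat.completions.create(model=model, messages=messages, tools=tools)
tool_call = completion.choices[0].message.tool_calls[0]

# Run the requested function with the model-supplied arguments ...
result = calculate_discount(**json.loads(tool_call.function.arguments))

# ... then append the assistant's tool call and the tool result, and ask again.
messages.append({
    "role": "assistant",
    "tool_calls": [{
        "id": tool_call.id,
        "type": "function",
        "function": {"name": tool_call.function.name, "arguments": tool_call.function.arguments},
    }],
})
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "name": tool_call.function.name,
    "content": json.dumps(result),
})
call(messages, model)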

License

The A.X 4.0 Light models are licensed under Apache License 2.0.

Citation

@article{SKTAdotX4Light,
  title={A.X 4.0 Light},
  author={SKT AI Model Lab},
  year={2025},
  url={https://huggingface.co/skt/A.X-4.0-Light}
}
