A.X 4.0 Light
🤗 Models | 💬 Chat | 💬 APIs (FREE!) | 🖥️ GitHub
A.X 4.0 Family Highlights
On July 3, 2025, SK Telecom released A.X 4.0 (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment. Built on the open-source Qwen2.5 model and further trained on large-scale Korean datasets, A.X 4.0 delivers outstanding performance in real-world business environments.
- Superior Korean Proficiency: Achieved a score of 78.3 on KMMLU, the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming GPT-4o (72.5).
- Deep Cultural Understanding: Scored 83.5 on CLIcK, a benchmark for Korean cultural and contextual comprehension, surpassing GPT-4o (80.2).
- Efficient Token Usage: A.X 4.0 uses approximately 33% fewer tokens than GPT-4o for the same Korean input, enabling more cost-effective and efficient processing (see the tokenizer sketch after this list).
- Deployment Flexibility: Offered in both a 72B-parameter standard model (A.X 4.0) and a 7B lightweight version (A.X 4.0 Light).
- Long Context Handling: Supports up to 131,072 tokens, allowing comprehension of lengthy documents and conversations. (The lightweight model supports up to 16,384 tokens.)
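
To make the token-usage comparison concrete, here is a minimal sketch (not part of the original card) that counts tokens for one Korean sentence with the A.X tokenizer and, via the `tiktoken` package, with GPT-4o's tokenizer. The sample sentence is illustrative, and the measured ratio will vary by text.

```python
# Sketch: compare Korean token counts between the A.X tokenizer and GPT-4o's.
# Assumes `pip install transformers tiktoken`; the sample sentence is illustrative.
from transformers import AutoTokenizer
import tiktoken

ax_tokenizer = AutoTokenizer.from_pretrained("skt/A.X-4.0-Light")
gpt4o_encoding = tiktoken.encoding_for_model("gpt-4o")

text = "에어컨 여름철 적정 온도는 일반적으로 24-26도입니다."
ax_count = len(ax_tokenizer.encode(text))
gpt4o_count = len(gpt4o_encoding.encode(text))
print(f"A.X: {ax_count} tokens, GPT-4o: {gpt4o_count} tokens "
      f"({(1 - ax_count / gpt4o_count) * 100:.0f}% fewer)")
```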
Performance
Model Performance
| Category | Benchmark | A.X 4.0 | Qwen3-235B-A22B (w/o reasoning) | Qwen2.5-72B | GPT-4o |
|---|---|---|---|---|---|
| Knowledge | KMMLU | 78.32 | 73.64 | 66.44 | 72.51 |
| Knowledge | CLIcK | 83.51 | 74.55 | 72.59 | 80.22 |
| Knowledge | KoBALT | 47.30 | 41.57 | 37.00 | 44.00 |
| Knowledge | MMLU | 86.62 | 87.37 | 85.70 | 88.70 |
| General | Ko-MT-Bench | 86.69 | 88.00 | 82.69 | 88.44 |
| General | MT-Bench | 83.25 | 86.56 | 93.50 | 88.19 |
| General | LiveBench (2024.11) | 52.30 | 64.50 | 54.20 | 52.19 |
| Instruction Following | Ko-IFEval | 77.96 | 77.53 | 77.07 | 75.38 |
| Instruction Following | IFEval | 86.05 | 85.77 | 86.54 | 83.86 |
| Math | HRM8K | 48.55 | 54.52 | 46.37 | 43.27 |
| Math | MATH | 74.28 | 72.72 | 77.00 | 72.38 |
| Code | HumanEval+ | 79.27 | 79.27 | 81.71 | 86.00 |
| Code | MBPP+ | 73.28 | 70.11 | 75.66 | 75.10 |
| Code | LiveCodeBench (2024.10–2025.04) | 26.07 | 33.09 | 27.58 | 29.30 |
| Long Context | LongBench (<128K) | 56.70 | 49.40 | 45.60 | 47.50 |
| Tool-use | FunctionChatBench | 85.96 | 82.43 | 88.30 | 95.70 |
Lightweight Model Performance
| Category | Benchmark | A.X 4.0 Light | Qwen3-8B (w/o reasoning) | Qwen2.5-7B | EXAONE-3.5-7.8B | Kanana-1.5-8B |
|---|---|---|---|---|---|---|
| Knowledge | KMMLU | 64.15 | 63.53 | 49.56 | 53.76 | 48.28 |
| Knowledge | CLIcK | 68.05 | 62.71 | 60.56 | 64.30 | 61.30 |
| Knowledge | KoBALT | 30.29 | 26.57 | 21.57 | 21.71 | 23.14 |
| Knowledge | MMLU | 75.43 | 82.89 | 75.40 | 72.20 | 68.82 |
| General | Ko-MT-Bench | 79.50 | 64.06 | 61.31 | 81.06 | 76.30 |
| General | MT-Bench | 81.56 | 65.69 | 79.37 | 83.50 | 77.60 |
| General | LiveBench | 37.10 | 50.20 | 37.00 | 40.20 | 29.40 |
| Instruction Following | Ko-IFEval | 72.99 | 73.39 | 60.73 | 65.01 | 69.96 |
| Instruction Following | IFEval | 84.68 | 85.38 | 76.73 | 82.61 | 80.11 |
| Math | HRM8K | 40.12 | 52.50 | 35.13 | 31.88 | 30.87 |
| Math | MATH | 68.88 | 71.48 | 65.58 | 63.20 | 59.28 |
| Code | HumanEval+ | 75.61 | 77.44 | 74.39 | 76.83 | 76.83 |
| Code | MBPP+ | 67.20 | 62.17 | 68.50 | 64.29 | 67.99 |
| Code | LiveCodeBench | 18.03 | 23.93 | 16.62 | 17.98 | 16.52 |
🚀 Quickstart
with HuggingFace Transformers

`transformers>=4.46.0` or the latest version is required to use `skt/A.X-4.0-Light`.

```bash
pip install "transformers>=4.46.0"
```
Example Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skt/A.X-4.0-Light"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    # System prompt (Korean): "You are an AI expert who translates English sentences provided by the user into Korean."
    {"role": "system", "content": "당신은 사용자가 제공하는 영어 문장들을 한국어로 번역하는 AI 전문가입니다."},
    {"role": "user", "content": "The first human went into space and orbited the Earth on April 12, 1961."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,
    )

len_input_prompt = len(input_ids[0])
response = tokenizer.decode(output[0][len_input_prompt:], skip_special_tokens=True)
print(response)
# Output:
# 1961년 4월 12일, 최초의 인간이 우주로 나가 지구를 공전했습니다.
# (English: "On April 12, 1961, the first human went into space and orbited the Earth.")
```
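
If you prefer to see tokens as they are generated instead of waiting for `generate` to return, Transformers' `TextStreamer` can be attached. This streaming variant is an addition, not part of the original card; it is a minimal sketch reusing `model`, `tokenizer`, and `input_ids` from the example above.

```python
from transformers import TextStreamer

# Prints the response incrementally; skip_prompt avoids echoing the input prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
with torch.no_grad():
    model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,
        streamer=streamer,
    )
```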
with vLLM

`vllm>=0.6.4.post1` or the latest version is required to use the tool-use function.

```bash
pip install "vllm>=0.6.4.post1"
```

```bash
# if you don't want to activate the tool-use function, just comment out the vLLM option below
VLLM_OPTION="--enable-auto-tool-choice --tool-call-parser hermes"
vllm serve skt/A.X-4.0-Light $VLLM_OPTION
```
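
Before running the client examples below, a quick way to confirm the server is up is to list the models it exposes through the OpenAI-compatible API. This is a sketch (not from the original card), assuming the default port 8000 used by the serve command above.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="api_key")
# Lists the models the server exposes; expect "skt/A.X-4.0-Light" in the result.
print([m.id for m in client.models.list().data])
```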
Example Usage
```python
from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-4.0-Light"

# Query (Korean): "What is the appropriate air-conditioner temperature in summer? Answer in one line."
messages = [{"role": "user", "content": "에어컨 여름철 적정 온도는? 한줄로 답변해줘"}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='여름철 적정 에어컨 온도는 일반적으로 24-26도입니다.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
# (content in English: "The appropriate air-conditioner temperature in summer is generally 24-26°C.")

messages = [{"role": "user", "content": "What is the appropriate temperature for air conditioning in summer? Response in a single sentence."}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='The appropriate temperature for air conditioning in summer generally ranges from 72°F to 78°F (22°C to 26°C) for comfort and energy efficiency.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
```
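
The endpoint also supports the standard OpenAI streaming interface. A short sketch (an addition to the card's examples) reusing `client`, `model`, and `messages` from above:

```python
# Stream the completion token-by-token instead of waiting for the full message.
stream = client.chat.completions.create(
    model=model,
    messages=messages,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```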
Examples for tool-use
```python
from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-4.0-Light"

calculate_discount = {
    "type": "function",
    "function": {
        "name": "calculate_discount",
        # Korean: "Takes the original price and a discount rate (in percent) and computes the discounted price."
        "description": "원가격과 할인율(퍼센트 단위)을 입력받아 할인된 가격을 계산한다.",
        "parameters": {
            "type": "object",
            "properties": {
                "original_price": {
                    "type": "number",
                    "description": "상품의 원래 가격"  # "the product's original price"
                },
                "discount_percentage": {
                    "type": "number",
                    "description": "적용할 할인율(예: 20% 할인의 경우 20을 입력)"  # "the discount rate to apply (e.g., enter 20 for a 20% discount)"
                }
            },
            "required": ["original_price", "discount_percentage"]
        }
    }
}
get_exchange_rate = {
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        # Korean: "Fetches the exchange rate between two currencies."
        "description": "두 통화 간의 환율을 가져온다.",
        "parameters": {
            "type": "object",
            "properties": {
                "base_currency": {
                    "type": "string",
                    "description": "The currency to convert from."
                },
                "target_currency": {
                    "type": "string",
                    "description": "The currency to convert to."
                }
            },
            "required": ["base_currency", "target_currency"]
        }
    }
}
tools = [calculate_discount, get_exchange_rate]
### Slot filling ###
# User (Korean): "We need to buy something. It's originally 57,600 won, but we can get an employee discount. Work out the discounted price for me."
messages = [{"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='할인율을 알려주시겠습니까?', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
# (content in English: "Could you tell me the discount rate?")

### Function calling ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"},
    {"role": "assistant", "content": "할인율을 알려주시겠습니까?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},  # "We can get a 15% discount."
]
call(messages, model)
# Output:
# ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-7778d1d9fca94bf2acbb44c79359502c', function=Function(arguments='{"original_price": 57600, "discount_percentage": 15}', name='calculate_discount'), type='function')], reasoning_content=None)

### Completion ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원래 57600원인데 직원할인 받을 수 있거든? 할인가좀 계산해줘"},
    {"role": "assistant", "content": "할인율을 알려주시겠습니까?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
    {"role": "tool", "tool_call_id": "random_id", "name": "calculate_discount", "content": "{\"original_price\": 57600, \"discount_percentage\": 15, \"discounted_price\": 48960.0}"}
]
call(messages, model)
# Output:
# ChatCompletionMessage(content='57600원의 상품에서 15% 할인을 적용하면, 할인된 가격은 48960원입니다.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
# (content in English: "Applying a 15% discount to the 57,600-won item gives a discounted price of 48,960 won.")
```
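
In practice, the `tool` message in the Completion step is produced by your own code: execute the function the model requested, then append the result and call the endpoint again. Below is a sketch of that loop (an addition, not from the original card), where `calculate_discount_impl` is a hypothetical local implementation and `messages` is the three-message list from the Function calling step.

```python
import json

def calculate_discount_impl(original_price, discount_percentage):
    # Hypothetical local implementation of the calculate_discount tool.
    return {
        "original_price": original_price,
        "discount_percentage": discount_percentage,
        "discounted_price": original_price * (1 - discount_percentage / 100),
    }

completion = client.chat.completions.create(model=model, messages=messages, tools=tools)
message = completion.choices[0].message
if message.tool_calls:
    tool_call = message.tool_calls[0]
    arguments = json.loads(tool_call.function.arguments)
    result = calculate_discount_impl(**arguments)
    # Append the tool result (as in the Completion example) and ask the model to finish.
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "name": tool_call.function.name,
        "content": json.dumps(result),
    })
    completion = client.chat.completions.create(model=model, messages=messages, tools=tools)
    print(completion.choices[0].message.content)
```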
License

The A.X 4.0 Light models are licensed under the Apache License 2.0.
Citation
```bibtex
@article{SKTAdotX4Light,
  title={A.X 4.0 Light},
  author={SKT AI Model Lab},
  year={2025},
  url={https://huggingface.co/skt/A.X-4.0-Light}
}
```
Contact
- Business & Partnership Contact: a.x@sk.com
Model tree for mykor/A.X-4.0-Light-gguf
Base model: skt/A.X-4.0-Light