---
base_model: unsloth/Qwen2.5-1.5B-Instruct
library_name: peft
license: mit
datasets:
- Rustamshry/medical_o1_reasoning_SFT_az
language:
- az
pipeline_tag: question-answering
tags:
- biology
- medical
---

# Model Card for Qwen2.5-1.5B-Medical-Az

### Model Description

This model is a fine-tuned version of Qwen2.5-1.5B-Instruct on an Azerbaijani medical reasoning dataset.
It is designed to understand complex medical instructions, interpret clinical cases, and generate informed answers in Azerbaijani.

- **Developed by:** Rustam Shiriyev
- **Model type:** Causal Language Model
- **Language(s) (NLP):** Azerbaijani
- **License:** MIT
- **Finetuned from model:** unsloth/Qwen2.5-1.5B-Instruct
- **Fine-tuning Method:** Supervised Fine-Tuning (SFT) using Unsloth + LoRA
- **Domain:** Medical Question Answering / Reasoning
- **Dataset:** ~19,696 rows translated from the FreedomIntelligence/medical-o1-reasoning-SFT dataset


## Uses

### Direct Use

You can use this model directly for:

- Medical QA tasks in Azerbaijani
- Evaluating LLMs' ability to reason about clinical data in low-resource languages
- Generating educational prompts or tutoring-style medical answers
- Research on instruction tuning and localization of medical language models

### Out-of-Scope Use

- Use in life-critical medical applications
- Any application where incorrect answers could cause harm
- Use by patients or non-medical professionals for self-diagnosis
- Deployment in commercial healthcare systems without regulatory oversight or expert validation

## Bias, Risks, and Limitations

The model has not been clinically validated and must not be used for real medical decision-making.
It was trained on a single-source dataset, so it may not generalize to all medical topics.
Zero-shot cross-lingual transfer (e.g., English → Azerbaijani medical knowledge) has not been evaluated.


## How to Get Started with the Model

```python
login(token="")  

tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen2.5-1.5B-Instruct",)
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen2.5-1.5B-Instruct",
    device_map={"": 0}, token=""
)

model = PeftModel.from_pretrained(base_model,"Rustamshry/Qwen2.5-1.5B-Medical-Az")

question = "45 yaşlı kişi qəfil danışıqda pozulma, yeriyişində dəyişiklik və titrəmə meydana gəlir. Ən ehtimal diaqnoz nədir?"
prompt = f"""### Question:\n{question}\n\n### Response:\n"""

input_ids = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **input_ids, 
    max_new_tokens=2000,
    #temperature=0.6,
    #top_p=0.95,
    #do_sample=True,
    #eos_token_id=tokenizer.eos_token_id
)

print(tokenizer.decode(outputs[0]))
```
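
Optionally, the LoRA adapter can be merged into the base weights so the model can later be served with plain `transformers` and no `peft` dependency. This is the standard `peft` merge pattern, not a step published with this model; the output directory name below is only an example.

```python
# Optional: merge the LoRA adapter into the base model weights.
merged = model.merge_and_unload()
merged.save_pretrained("Qwen2.5-1.5B-Medical-Az-merged")
tokenizer.save_pretrained("Qwen2.5-1.5B-Medical-Az-merged")
```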

## Training Details

### Training Data

The model was fine-tuned on a translated and cleaned version of FreedomIntelligence/medical-o1-reasoning-SFT, which was manually converted into Azerbaijani. 
All examples were filtered for translation quality and medical relevance.

- **Translated dataset:** Rustamshry/medical_o1_reasoning_SFT_az
- **Original dataset:** huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT
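
For a quick look at the training data, the translated dataset can be loaded directly with the `datasets` library. This is a minimal sketch; the split and column names are whatever the dataset repository defines.

```python
from datasets import load_dataset

# Load the Azerbaijani translation of medical-o1-reasoning-SFT.
ds = load_dataset("Rustamshry/medical_o1_reasoning_SFT_az", split="train")

print(ds)     # row count and column names
print(ds[0])  # one translated question / reasoning / answer example
```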


### Training Procedure

The model was trained with supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT) via LoRA, using the Unsloth library for memory-optimized training; a configuration sketch is given after the hyperparameter list below.

- **Training regime:** fp16
- **Epochs:** 2
- **Batch size:** 2
- **Gradient accumulation steps:** 4
- **Max sequence length:** 2000
- **Learning rate:** 2e-5
- **Optimizer:** adamw_torch
- **fp16:** True
- **LoRA rank:** 6
- **Alpha:** 16
- **Target modules:** QKV, O, and MLP projections in all 28 transformer layers
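
For reference, a minimal training sketch in the style of the public Unsloth SFT examples is shown below. Only the hyperparameters listed above come from this card; the target modules, dataset column names, prompt formatting, and trainer arguments are assumptions and may not match the exact training script.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Base model with the max sequence length reported in this card.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-1.5B-Instruct",
    max_seq_length=2000,
)

# Attach LoRA adapters (rank/alpha from this card; target modules assumed to be
# the attention and MLP projections, matching "QKV, O, and MLP" above).
model = FastLanguageModel.get_peft_model(
    model,
    r=6,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("Rustamshry/medical_o1_reasoning_SFT_az", split="train")

def to_text(example):
    # Assumed column names (taken from the original English dataset); the
    # translated dataset may name them differently. The prompt format mirrors
    # the inference example earlier in this card.
    return {
        "text": f"### Question:\n{example['Question']}\n\n"
                f"### Response:\n{example['Complex_CoT']}\n{example['Response']}"
                + tokenizer.eos_token
    }

dataset = dataset.map(to_text)

# Argument names follow the older trl releases used in Unsloth examples.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2000,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=2,
        learning_rate=2e-5,
        fp16=True,
        optim="adamw_torch",
        output_dir="outputs",
    ),
)
trainer.train()
```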

#### Speeds, Sizes, Times

- **Training speed:** 0.12 steps/sec
- **Total training time:** 11 hours, 26 minutes
- **Total training steps:** 4924

#### Hardware

- **GPUs used:** NVIDIA Tesla T4 via Kaggle Notebook

#### Result

- **Training loss:** 2.68 → 1.63

### Framework versions

- PEFT 0.14.0