---
base_model: unsloth/Qwen2.5-3B-Instruct
library_name: peft
license: mit
datasets:
- CausalLM/GPT-4-Self-Instruct-Turkish
language:
- tr
pipeline_tag: question-answering
---

# Model Card for Qwen2.5-3B-Self-Instruct-Turkish


### Model Description

This model is a fine-tuned version of Qwen2.5-3B-Instruct, optimized for Turkish instruction-following tasks.
Trained on the CausalLM/GPT-4-Self-Instruct-Turkish dataset, it understands and responds to a wide range of Turkish prompts, with improved performance on tasks such as question answering.

- **Language(s) (NLP):** Turkish
- **License:** MIT
- **Finetuned from model:** unsloth/Qwen2.5-3B-Instruct


## Uses

### Direct Use

This model is intended for applications requiring Turkish language understanding and generation, particularly in instruction-following scenarios.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

login(token="")  # supply your Hugging Face access token here

tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen2.5-3B-Instruct")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen2.5-3B-Instruct",
    device_map="auto",
)

model = PeftModel.from_pretrained(base_model, "Rustamshry/Qwen2.5-3B-Self-Instruct-Turkish")


question = "Türkiye'deki sağlık hizmetleri ve hastaneler hakkında genel bir özet oluşturun."

prompt = (
    f"### Soru:\n{question}\n\n"
    f"### Cevap:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    # Sampling settings to uncomment for more varied generations:
    # temperature=0.6,
    # top_p=0.95,
    # do_sample=True,
    # eos_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- Dataset: CausalLM/GPT-4-Self-Instruct-Turkish

- Description: A collection of Turkish instruction-response pairs generated using the Self-Instruct methodology, where GPT-4 was employed to create synthetic instruction data. 
  This approach aims to improve the model's ability to follow diverse and complex instructions in Turkish.
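Each instruction-response pair can be rendered into the same `### Soru:` / `### Cevap:` template used at inference. A minimal sketch of that formatting step (the field names `instruction` and `output` are assumptions about the dataset schema, not verified against the actual dataset):

```python
def to_training_text(example: dict) -> str:
    # Mirror the inference-time prompt template: the question goes under
    # "### Soru:" and the reference answer under "### Cevap:".
    return (
        f"### Soru:\n{example['instruction']}\n\n"
        f"### Cevap:\n{example['output']}"
    )

sample = {"instruction": "Türkiye'nin başkenti neresidir?", "output": "Ankara."}
print(to_training_text(sample))
```

Keeping the training template identical to the inference prompt is what lets the fine-tuned adapter reliably continue the `### Cevap:` section at generation time.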

### Framework versions

- PEFT 0.15.2