# DAPO-Coding-Qwen2.5-1.5B-Instruct

This model is a fine-tuned version of Qwen2.5-1.5B-Instruct, trained with DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization) for coding tasks.
## Model Details
- Base Model: Qwen/Qwen2.5-1.5B-Instruct
- Fine-tuning Method: DAPO
- Training Steps: 500
- Model Size: 1.5B parameters
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("your-username/DAPO-Coding-Qwen2.5-1.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("your-username/DAPO-Coding-Qwen2.5-1.5B-Instruct")

# Example usage for code generation
prompt = "Write a Python function to calculate fibonacci numbers:"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens bounds the generated continuation rather than the total sequence length
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
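Because the base model is instruction-tuned, prompts generally work better when wrapped in the tokenizer's chat template rather than passed as raw text. A minimal sketch (the repo name `your-username/...` is a placeholder, as above):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "your-username/DAPO-Coding-Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Standard chat-message format expected by apply_chat_template
messages = [
    {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."}
]
# apply_chat_template inserts the Qwen role markers and the generation prompt
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```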
## Training Details
This model was trained using the DAPO framework to align it with coding preferences. Training was completed at global step 500.