DAPO-Coding-Qwen2.5-1.5B-Instruct

This model is a fine-tuned version of Qwen2.5-1.5B-Instruct using DAPO (Direct Alignment via Preference Optimization) for coding tasks.

Model Details

  • Base Model: Qwen/Qwen2.5-1.5B-Instruct
  • Fine-tuning Method: DAPO
  • Training Steps: 500
  • Model Size: 1.5B parameters

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "AmberYifan/DAPO-Coding-Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Example usage for code generation
prompt = "Write a Python function to calculate Fibonacci numbers:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
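
Since the base model is an Instruct (chat) model, prompts are normally wrapped with the model's chat template rather than passed as raw text. Below is a minimal sketch using the standard transformers chat-template API; the example prompt is illustrative and not taken from the card.

# Chat-style generation via the tokenizer's chat template
messages = [
    {"role": "user", "content": "Write a Python function that returns the n-th Fibonacci number."}
]
chat_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
chat_outputs = model.generate(chat_inputs, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(chat_outputs[0][chat_inputs.shape[-1]:], skip_special_tokens=True))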

Training Details

This model was trained using the DAPO framework for alignment with coding preferences. Training was completed at global step 500.
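
The card does not include the training code. Purely as an illustration of the kind of pairwise preference-optimization objective described above, the sketch below computes a DPO-style loss over chosen/rejected completions; the function name, beta value, and toy inputs are hypothetical and are not taken from this model's actual DAPO training run.

import torch
import torch.nn.functional as F

def preference_loss(policy_chosen_logps, policy_rejected_logps,
                    ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Log-probability ratios of the policy vs. a frozen reference model.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # DPO-style objective: push the chosen completion's ratio above the
    # rejected one's, with beta controlling the strength of the preference margin.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with random log-probabilities (illustration only)
lp = torch.randn(4)
loss = preference_loss(lp + 0.5, lp - 0.5, lp, lp)
print(loss.item())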
