---
library_name: peft
license: mit
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
tags:
- axolotl
- generated_from_trainer
datasets:
- kanhatakeyama/ramdom-to-fixed-multiturn-Calm3
- Aratako/Magpie-Tanuki-Qwen2.5-72B-Answered
- Aratako/magpie-qwen2.5-32b-reasoning-100k-formatted
- Aratako/magpie-reasoning-llama-nemotron-70b-100k-filtered
- Aratako/Open-Platypus-Japanese-masked-formatted
- kanhatakeyama/wizardlm8x22b-logical-math-coding-sft_additional-ja
- Aratako/magpie-ultra-v0.1-formatted
- Aratako/orca-agentinstruct-1M-v1-selected
- Aratako/Synthetic-JP-EN-Coding-Dataset-801k-50k
model-index:
- name: DeepSeek-R1-Distill-Qwen-14B-axolotl-int-v1.0
  results: []
---

[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.8.0.dev0`

```yaml
# Base model settings
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

# Settings for uploading the trained model to the Hugging Face Hub
hub_model_id: kazuyamaa/DeepSeek-R1-Distill-Qwen-14B-axolotl-int-v1.0
hub_strategy: "end"
push_dataset_to_hub:
hf_use_auth_token: true

# Liger Kernel settings (lighter and faster training)
plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_cross_entropy: false
liger_rope: true
liger_rms_norm: true
liger_swiglu: true
liger_fused_linear_cross_entropy: true

# Quantization settings
load_in_8bit: false
load_in_4bit: true

# Chat template used for SFT
chat_template: gemma

# Training dataset preprocessing settings
datasets:
  - path: kanhatakeyama/ramdom-to-fixed-multiturn-Calm3
    split: 20240806filtered[0:10000]
    type: chat_template
    field_messages: messages
    message_field_role: role
    message_field_content: content
  - path: Aratako/Magpie-Tanuki-Qwen2.5-72B-Answered
    split: train[0:10000]
    type: chat_template
    field_messages: messages
    message_field_role: role
    message_field_content: content
  - path: Aratako/magpie-qwen2.5-32b-reasoning-100k-formatted
    split: train[0:10000]
    type: chat_template
    field_messages: conversations
    message_field_role: role
    message_field_content: content
  - path: Aratako/magpie-reasoning-llama-nemotron-70b-100k-filtered
    split: train[0:10000]
    type: chat_template
    field_messages: conversations
    message_field_role: role
    message_field_content: content
  - path: Aratako/Open-Platypus-Japanese-masked-formatted
    split: train[0:10000]
    type: chat_template
    field_messages: conversations
    message_field_role: role
    message_field_content: content
  - path: kanhatakeyama/wizardlm8x22b-logical-math-coding-sft_additional-ja
    split: train[0:10000]
    type: chat_template
    field_messages: messages
    message_field_role: role
    message_field_content: content
  - path: Aratako/magpie-ultra-v0.1-formatted
    split: train[0:10000]
    type: chat_template
    field_messages: conversations
    message_field_role: role
    message_field_content: content
  - path: Aratako/orca-agentinstruct-1M-v1-selected
    split: train[0:10000]
    type: chat_template
    field_messages: messages
    message_field_role: role
    message_field_content: content
  - path: Aratako/Synthetic-JP-EN-Coding-Dataset-801k-50k
    split: train[0:10000]
    type: chat_template
    field_messages: messages
    message_field_role: role
    message_field_content: content

# Output locations for the prepared dataset and the trained model
shuffle_merged_datasets: true
dataset_prepared_path: /workspace/data/sft-data
output_dir: /workspace/data/models/DeepSeek-R1-Distill-Qwen-14B-axolotl-int-v1.0

# Validation set size
val_set_size: 0.05

# LoRA settings (leave all of these blank for full fine-tuning)
adapter: qlora
lora_model_dir:
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

# Weights & Biases (wandb) settings
wandb_project: axolotl
wandb_entity: kazukitakayamas051-securities-companies
wandb_watch:
wandb_name: sft-lora-1
wandb_log_model:

# Miscellaneous training settings
sequence_len: 4096
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true
gradient_accumulation_steps: 16
micro_batch_size: 1
num_epochs: 1
optimizer: paged_adamw_8bit
lr_scheduler: cosine
cosine_min_lr_ratio: 0.1
learning_rate: 3e-4
train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false
gradient_checkpointing: false
early_stopping_patience:
auto_resume_from_checkpoints: true
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
save_strategy: steps
save_steps: 50
save_total_limit: 2
warmup_steps: 10
eval_steps: 50
eval_batch_size: 1
eval_table_size:
eval_max_new_tokens:
debug:
deepspeed: /workspace/axolotl/deepspeed_configs/zero3_bf16.json
weight_decay: 0.01
fsdp:
fsdp_config:
special_tokens:
  pad_token:
```

</details>
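For orientation, here is a minimal Python sketch (not the script Axolotl actually runs) of how the QLoRA-related settings above (`load_in_4bit`, `lora_r: 16`, `lora_alpha: 32`, `lora_dropout: 0.05`, `lora_target_linear: true`) map onto `bitsandbytes` and `peft` objects; the variable names and the NF4 quantization type are illustrative choices, not values taken from the config.

```python
# Illustrative sketch only: Axolotl builds the equivalent objects internally from the YAML above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"

# load_in_4bit: true -> 4-bit quantization of the frozen base weights
# (NF4 and bf16 compute are chosen here for the sketch)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# lora_r / lora_alpha / lora_dropout / lora_target_linear from the config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",  # rough equivalent of lora_target_linear: true
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```

With `micro_batch_size: 1`, `gradient_accumulation_steps: 16`, and 2 GPUs, the effective batch size is 1 × 16 × 2 = 32, which matches the `total_train_batch_size` reported in the training hyperparameters below.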

# DeepSeek-R1-Distill-Qwen-14B-axolotl-int-v1.0

This model is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) on the following datasets:

- kanhatakeyama/ramdom-to-fixed-multiturn-Calm3
- Aratako/Magpie-Tanuki-Qwen2.5-72B-Answered
- Aratako/magpie-qwen2.5-32b-reasoning-100k-formatted
- Aratako/magpie-reasoning-llama-nemotron-70b-100k-filtered
- Aratako/Open-Platypus-Japanese-masked-formatted
- kanhatakeyama/wizardlm8x22b-logical-math-coding-sft_additional-ja
- Aratako/magpie-ultra-v0.1-formatted
- Aratako/orca-agentinstruct-1M-v1-selected
- Aratako/Synthetic-JP-EN-Coding-Dataset-801k-50k

It achieves the following results on the evaluation set:
- Loss: 0.6711

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- total_eval_batch_size: 2
- optimizer: paged_adamw_8bit with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 1.0

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.1079        | 0.0015 | 1    | 1.0631          |
| 0.8387        | 0.0763 | 50   | 0.7640          |
| 0.7109        | 0.1526 | 100  | 0.7312          |
| 0.7324        | 0.2289 | 150  | 0.7155          |
| 0.8239        | 0.3051 | 200  | 0.7045          |
| 0.7019        | 0.3814 | 250  | 0.6967          |
| 0.8834        | 0.4577 | 300  | 0.6910          |
| 0.7097        | 0.5340 | 350  | 0.6857          |
| 0.6659        | 0.6103 | 400  | 0.6821          |
| 0.6755        | 0.6866 | 450  | 0.6785          |
| 0.6465        | 0.7628 | 500  | 0.6755          |
| 0.6697        | 0.8391 | 550  | 0.6735          |
| 0.8425        | 0.9154 | 600  | 0.6720          |
| 0.6461        | 0.9917 | 650  | 0.6711          |

### Framework versions

- PEFT 0.14.0
- Transformers 4.49.0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.1
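As a usage note, the following is a minimal inference sketch, assuming the LoRA adapter is the one pushed to `kazuyamaa/DeepSeek-R1-Distill-Qwen-14B-axolotl-int-v1.0` (the `hub_model_id` in the config above). It is not an official example; in particular, training used `chat_template: gemma`, so prompts built with the base tokenizer's own chat template may not match the training format exactly.

```python
# Minimal inference sketch; assumes the adapter repo above exists and a GPU with
# enough memory is available. Not an official usage example for this model.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "kazuyamaa/DeepSeek-R1-Distill-Qwen-14B-axolotl-int-v1.0"
base_model = "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"

# Loads the base model and attaches the LoRA adapter on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Japanese prompt, since most of the SFT data is Japanese.
messages = [{"role": "user", "content": "LoRAによるファインチューニングを簡潔に説明してください。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids=inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```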