thenlper, tastelikefeet committed on
Commit 389cdda (verified) · 1 Parent(s): 0d2ad8e

Support fine-tuning (#35)


- Support fine-tuning (1cad2ab3ff41c2671f34e135d29831368ee26b68)


Co-authored-by: tastelikefeet <tastelikefeet@users.noreply.huggingface.co>

Files changed (1): README.md (+40 −0)
README.md CHANGED

@@ -5682,6 +5682,46 @@ In addition to the open-source [GTE](https://huggingface.co/collections/Alibaba-
 
 Note that the models behind the commercial APIs are not entirely identical to the open-source models.
 
+## Community support
+
+### Fine-tuning
+
+GTE models can be fine-tuned with the third-party framework SWIFT.
+
+```shell
+pip install ms-swift -U
+```
+
+```shell
+# check: https://swift.readthedocs.io/en/latest/BestPractices/Embedding.html
+nproc_per_node=8
+NPROC_PER_NODE=$nproc_per_node \
+USE_HF=1 \
+swift sft \
+    --model Alibaba-NLP/gte-Qwen2-1.5B-instruct \
+    --train_type lora \
+    --dataset 'sentence-transformers/stsb' \
+    --torch_dtype bfloat16 \
+    --num_train_epochs 10 \
+    --per_device_train_batch_size 2 \
+    --per_device_eval_batch_size 1 \
+    --gradient_accumulation_steps $(expr 64 / $nproc_per_node) \
+    --eval_steps 100 \
+    --save_steps 100 \
+    --eval_strategy steps \
+    --use_chat_template false \
+    --save_total_limit 5 \
+    --logging_steps 5 \
+    --output_dir output \
+    --warmup_ratio 0.05 \
+    --learning_rate 5e-6 \
+    --deepspeed zero3 \
+    --dataloader_num_workers 4 \
+    --task_type embedding \
+    --loss_type cosine_similarity \
+    --dataloader_drop_last true
+```
+
 ## Citation
 
 If you find our paper or models helpful, please consider citing:
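The `--gradient_accumulation_steps $(expr 64 / $nproc_per_node)` expression in the committed command scales accumulation inversely with GPU count, so the effective global batch size stays constant as long as `per_device_train_batch_size` is kept at 2. A quick sketch of that arithmetic (the variable values below are just the ones from the example, not requirements):

```shell
# Effective global batch size = per-device batch × GPU count × accumulation steps.
# With the example's settings (8 GPUs, per-device batch of 2):
nproc_per_node=8
per_device_train_batch_size=2
gradient_accumulation_steps=$(expr 64 / $nproc_per_node)   # 64 / 8 = 8
expr $per_device_train_batch_size \* $nproc_per_node \* $gradient_accumulation_steps
# prints 128; the same 128 results for any nproc_per_node that divides 64
```

Running the same arithmetic with `nproc_per_node=4` gives 2 × 4 × 16 = 128 again, which is why the command hard-codes 64 in the `expr` rather than a fixed accumulation count.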