Yifan Wang
AmberYifan
AI & ML interests
None yet
Recent Activity
published a model about 17 hours ago: AmberYifan/Qwen3-4B-OpenR1Math-MARL
published a model about 17 hours ago: AmberYifan/Qwen3-4B-OpenR1Math-GRPO
published a model 5 days ago: AmberYifan/Qwen3-4B-Thinking-Math-GRPO
Organizations
SFT models
- AmberYifan/llama2-7b-sft-ultrachat-safeRLHF
  Text Generation • 7B • Updated • 5
- AmberYifan/mistral-v0.1-7b-sft-ultrachat-safeRLHF
  Text Generation • 7B • Updated • 5
- AmberYifan/Mistral-7B-v0.3-sft-ultrachat-safeRLHF
  Text Generation • 7B • Updated • 9
- AmberYifan/Gemma-2-9B-sft-ultrachat-safeRLHF
  Text Generation • 9B • Updated • 5
Safe SPIN
This collection contains the safetyQA dataset for Safe SPIN training and the trained models.
models
534
AmberYifan/Qwen3-4B-OpenR1Math-MARL
Updated
AmberYifan/Qwen3-4B-OpenR1Math-GRPO
Updated
AmberYifan/Qwen3-4B-Thinking-Math-GRPO
Updated
AmberYifan/Qwen2.5-32B-Instruct-wildfeedback-seed-RPO-0.001
Updated
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-iterDPO-iter2-4k
Text Generation • 0.0B • Updated • 23 • 1
AmberYifan/Qwen2.5-7B-Instruct-wildfeedback-iterDPO-iter2-4k
Text Generation • 0.0B • Updated • 25 • 1
AmberYifan/Qwen2.5-14B-Instruct-wildfeedback-RPO-iterDPO-iter1-4k
Text Generation • 0.0B • Updated • 35 • 1
AmberYifan/Qwen2.5-7B-Instruct-wildfeedback-iterDPO-iter1-4k
Text Generation • 0.0B • Updated • 28
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-DRIFT-iter2-RPO
Text Generation • 0.0B • Updated • 19
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-spin-iter2-RPO
Text Generation • 0.0B • Updated • 19
datasets
25
AmberYifan/mistral-v0.1-spin-hhrlhf
Viewer • Updated • 5.5k • 4
AmberYifan/sft-spin-filter
Updated
AmberYifan/sft-spin-kcenter-5k
Viewer • Updated • 5.5k • 2
AmberYifan/gsm8k-sft
Viewer • Updated • 8.79k • 1
AmberYifan/sft-spin-v
Viewer • Updated • 50.5k • 10
AmberYifan/safeRLHF-SFT
Viewer • Updated • 83.4k • 1
AmberYifan/SPIN-trans-DPOformat
Viewer • Updated • 55k • 3
AmberYifan/spin-v-diverse
Viewer • Updated • 55k • 2
AmberYifan/dpo-v
Viewer • Updated • 55k • 2
AmberYifan/spin-v
Viewer • Updated • 55k • 3