HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter12 • Text Generation • Updated about 7 hours ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter11 • Text Generation • Updated about 7 hours ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter10 • Text Generation • Updated about 7 hours ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter9 • Text Generation • Updated about 7 hours ago
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-a0.001-b2.0-iter12 • Text Generation • Updated 1 day ago • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-a0.001-b2.0-iter11 • Text Generation • Updated 1 day ago • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-a0.001-b2.0-iter10 • Text Generation • Updated 1 day ago • 1
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter9 • Text Generation • Updated 2 days ago • 10
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 • Text Generation • Updated 2 days ago • 81
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 • Text Generation • Updated 2 days ago • 5
HanningZhang/scalebio_reasoning_think_220k_with_system_and_cot • Viewer • Updated about 21 hours ago • 193k
HanningZhang/scalebio_reasoning_nonthink_50k_with_system_and_cot • Viewer • Updated 3 days ago • 50k • 50
HanningZhang/scalebio_reasoning_nonthink_20k_with_system_and_cot • Viewer • Updated 3 days ago • 20k • 60