HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher050_em-iter1 Text Generation • 8B • Updated Apr 23 • 5
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter12 Text Generation • 8B • Updated Apr 22 • 6
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter11 Text Generation • 8B • Updated Apr 22 • 5
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter10 Text Generation • 8B • Updated Apr 22 • 5
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter9 Text Generation • 8B • Updated Apr 22 • 5
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 Text Generation • 8B • Updated Apr 20 • 5
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 Text Generation • 8B • Updated Apr 20 • 5
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter6 Text Generation • 8B • Updated Apr 20 • 5
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter5 Text Generation • 8B • Updated Apr 20 • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 Text Generation • 8B • Updated Apr 19 • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 Text Generation • 8B • Updated Apr 19 • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter6 Text Generation • 8B • Updated Apr 19 • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter5 Text Generation • 8B • Updated Apr 19 • 5
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter4 Text Generation • 8B • Updated Apr 19 • 5
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter3 Text Generation • 8B • Updated Apr 19 • 5
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter2 Text Generation • 8B • Updated Apr 19 • 5
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter1 Text Generation • 8B • Updated Apr 19 • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter4 Text Generation • 8B • Updated Apr 18 • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter3 Text Generation • 8B • Updated Apr 18 • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter2 Text Generation • 8B • Updated Apr 18 • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter1 Text Generation • 8B • Updated Apr 18 • 5
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 Text Generation • 8B • Updated Apr 17 • 6
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 Text Generation • 8B • Updated Apr 17 • 6
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter6 Text Generation • 8B • Updated Apr 17 • 6
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter5 Text Generation • 8B • Updated Apr 17 • 6
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter4 Text Generation • 8B • Updated Apr 17 • 5
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter3 Text Generation • 8B • Updated Apr 17 • 5
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter2 Text Generation • 8B • Updated Apr 16 • 5
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter1 Text Generation • 8B • Updated Apr 16 • 5