sun

sunmaxim

AI & ML interests

LLMs for teaching and students

Recent Activity

upvoted a collection 9 days ago

InternVL3

liked a model 13 days ago

OpenGVLab/InternVL3-8B

Organizations

None yet

sunmaxim's activity

upvoted a collection 9 days ago

InternVL3

Collection • 34 items • Updated 4 days ago • 54

upvoted an article 17 days ago

Welcome Llama 4 Maverick & Scout on Hugging Face!

Article • 19 days ago • 140

upvoted a paper 17 days ago

Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31 • 22

upvoted a paper about 1 month ago

Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond

Paper • 2503.10460 • Published Mar 13 • 28

upvoted a collection about 1 month ago

OLMo 2

Collection • Artifacts for the second set of OLMo models. • 27 items • Updated Mar 20 • 109

upvoted an article about 2 months ago

Improving Hugging Face Training Efficiency Through Packing with Flash Attention

Article • Aug 21, 2024 • 33

upvoted 4 papers 2 months ago

upvoted an article 2 months ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4 • 1.22k

upvoted a collection 2 months ago

high-quality Chinese training datasets

Collection • A suite of high-quality Chinese datasets for pretraining, fine-tuning, or preference alignment, along with the models trained on them. • 13 items • Updated Mar 11 • 12

upvoted 2 papers 4 months ago

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Paper • 2407.21787 • Published Jul 31, 2024 • 13

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 365

upvoted a paper 5 months ago

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Paper • 2411.16489 • Published Nov 25, 2024 • 49