ldwang
AI & ML interests
LLM, MLLM, Infra
Recent Activity
liked a dataset 2 days ago: nvidia/Nemotron-Pretraining-SFT-v1
liked a dataset 2 days ago: nvidia/Nemotron-CC-v2
liked a model 3 days ago: arcee-ai/AFM-4.5B
Organizations
MiscAgentic
MiscIndustry
MiscKernel
MiscR1
MiscModels
MiscBlogs
-
Running · 582
Scaling test-time compute
📈 Implement test-time compute scaling for math problems
-
Running · 1.04k
FineWeb: decanting the web for the finest text data at scale
🍷 Generate high-quality web text data for LLM training
-
Running · 3.1k
The Ultra-Scale Playbook
🌌 The ultimate guide to training LLM on large GPU Clusters
MiscDatasets
MiscTools
Miscellaneous tools for LLMs and VLMs.