GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published 14 days ago • 155
Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings Paper • 2508.00632 • Published 21 days ago • 3
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm Paper • 2507.18553 • Published 29 days ago • 39
Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement Paper • 2507.18742 • Published 29 days ago • 5
Automated Discovery of High-Performance GPU Kernels with OpenEvolve Article • By codelion • Jun 27 • 21
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization Paper • 2507.06181 • Published Jul 8 • 41
Configurable Preference Tuning Collection • CPT uses rubric-guided synthetic data and DPO to enable LLMs to dynamically adjust behavior (e.g., writing style) at inference with system prompts • 7 items • Updated Jun 17 • 1
Configurable Preference Tuning with Rubric-Guided Synthetic Data Paper • 2506.11702 • Published Jun 13 • 2
Training-Free Tokenizer Transplantation via Orthogonal Matching Pursuit Paper • 2506.06607 • Published Jun 7 • 2
Atropos Artifacts Collection • A collection of experimental artifacts created with Atropos, Nous' RL Environments framework - https://github.com/NousResearch/Atropos • 9 items • Updated about 1 month ago • 10
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29 • 97
Perception Encoder: The best visual embeddings are not at the output of the network Paper • 2504.13181 • Published Apr 17 • 35
ReZero: Enhancing LLM search ability by trying one-more-time Paper • 2504.11001 • Published Apr 15 • 15
Custom Vibe Coding Quest Part 1: The Quest Begins Article • By burtenshaw • Mar 26 • 10