papers - a liheeem Collection

updated 9 days ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 256
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published Jun 5 • 130
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Paper • 2506.05176 • Published Jun 5 • 68
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30 • 270
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

Paper • 2506.03143 • Published Jun 3 • 51
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 177
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning

Paper • 2506.01713 • Published Jun 2 • 47
Large Language Models for Data Synthesis

Paper • 2505.14752 • Published May 20 • 50
ZeroGUI: Automating Online GUI Learning at Zero Human Cost

Paper • 2505.23762 • Published May 29 • 46
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 129
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28 • 46
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents

Paper • 2505.21496 • Published May 27 • 39
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 89
Synthetic Data RL: Task Definition Is All You Need

Paper • 2505.17063 • Published May 18 • 10
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning

Paper • 2505.16410 • Published May 22 • 57
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning

Paper • 2505.16421 • Published May 22 • 19
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Paper • 2505.15277 • Published May 21 • 103
Efficient Agent Training for Computer Use

Paper • 2505.13909 • Published May 20 • 45
MMSearch-R1: Incentivizing LMMs to Search

Paper • 2506.20670 • Published Jun 25 • 62
A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17 • 245
GUI-G^2: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21 • 131
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs

Paper • 2507.09477 • Published Jul 13 • 80
VeriGUI: Verifiable Long-Chain GUI Dataset

Paper • 2508.04026 • Published 17 days ago • 139
Phi-Ground Tech Report: Advancing Perception in GUI Grounding

Paper • 2507.23779 • Published 23 days ago • 42
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Paper • 2508.04700 • Published 17 days ago • 47
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published 26 days ago • 79
Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents

Paper • 2508.01858 • Published 20 days ago • 20
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published 13 days ago • 83
OpenCUA: Open Foundations for Computer-Use Agents

Paper • 2508.09123 • Published 11 days ago • 27