AgentRewardBench
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories Paper • 2504.08942 • Published 10 days ago • 25
McGill-NLP/agent-reward-bench • Updated about 11 hours ago • 1.41k • 1.61k • 2
Agent Reward Bench Demo 💻 Visualize agent interactions with WebArena tasks
Agent Reward Bench Leaderboard 🥇 Leaderboard for AgentRewardBench
BM25S
https://github.com/xhluca/bm25s
BM25S: Orders of magnitude faster lexical search via eager sparse scoring Paper • 2407.03618 • Published Jul 4, 2024 • 13
xhluca/bm25s-nq-index • Updated Jul 10, 2024 • 8
xhluca/bm25s-arguana-index • Updated Jul 13, 2024 • 2
xhluca/bm25s-climate-fever-index • Updated Jun 18, 2024