HeroBench: A Benchmark for Long-Horizon Planning and Structured Reasoning in Virtual Worlds Paper • 2508.12782 • Published 5 days ago • 23