ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks Paper • 2503.06885 • Published Mar 10 • 3 • 3