Online_Mind2Web_Leaderboard / human_Mind2Web-Online - Leaderboard_data.csv
WeijianQi1999's picture
update Operator model name
e3f57bf
raw
history blame
462 Bytes
Agent,Model,Organization,Source,Easy,Medium,Hard,Average SR,Date
Operator,OpenAI Computer-Using Agent,OpenAI,OSU NLP,83.1,58.0,43.2,61.3,2025-3-22
SeeAct,gpt-4o-2024-08-06,OSU,OSU NLP,60.2,25.2,8.1,30.7,2025-3-22
Browser Use,gpt-4o-2024-08-06,Browser Use,OSU NLP,55.4,26.6,8.1,30.0,2025-3-22
Claude Computer Use,claude-3-5-sonnet-20241022,Anthropic,OSU NLP,56.6,20.3,14.9,29.0,2025-3-22
Agent-E,gpt-4o-2024-08-06,Emergence AI,OSU NLP,49.4,26.6,6.8,28.0,2025-3-22