Show detailed model outputs for specific benchmarks
A leaderboard to rank large reasoning models
Leaderboard of LLMs based on detailed human feedback
Explore advanced functionalities in a clonable space
Generate a custom benchmark from any document
A vibe-coded horror game where you see with sound
Ranking of LLMs for agentic tasks
A demo for exploring and analyzing large-scale model repos
A leaderboard for LLMs powering smolagents
Evaluating LLMs on Greek financial tasks
Explore and discover all leaderboards from the HF community
Large Language Diffusion Models
Generate comic book adventures