Evaluate LLMs on Kazakh multiple-choice (MC) tasks
VLMEvalKit Evaluation Results Collection
Browse and submit language model benchmarks
Display and download auto-evaluation results