LLM Benchmark Platform

Run benchmarks in the background, close the UI, and come back to inspect rankings, task results, logs, and model outputs.

Ranking

Select Tasks

Select Models

Start

Jobs always run in the background.

Runs

Available Tasks

Task Prompt

Select a task.