LLM Benchmark Platform
Run benchmarks in the background, close the UI, and come back to inspect rankings, task results, logs, and model outputs.
Ranking
Select Tasks
Select Models
Start
Jobs always run in the background.
Runs
Available Tasks
Task Prompt
Select a task.