LLM Benchmark Platform

Run benchmarks in the background, close the UI, and come back to inspect rankings, task results, logs, and model outputs.
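The page does not describe how background runs are implemented. As a rough illustration of the idea only, the sketch below starts a run in a background thread, saves progress to a JSON file after every task, and lets a later session reload that file and compute a ranking. All names here (`start_job`, `load_run`, `ranking`, the `evaluate` callback, and the one-file-per-run layout) are hypothetical and are not the platform's API.

```python
import json
import threading
import uuid
from pathlib import Path


def start_job(runs_dir, tasks, models, evaluate):
    """Start a benchmark run in a background thread.

    `evaluate(model, task)` is a hypothetical callback returning a numeric score.
    Returns (run_id, thread) so the caller can wait on the thread if it wants to.
    """
    run_id = uuid.uuid4().hex
    run_file = Path(runs_dir) / f"{run_id}.json"
    run_file.parent.mkdir(parents=True, exist_ok=True)
    state = {"status": "running", "results": []}
    run_file.write_text(json.dumps(state))

    def worker():
        for model in models:
            for task in tasks:
                score = evaluate(model, task)
                state["results"].append(
                    {"model": model, "task": task, "score": score}
                )
                # Persist after every task so progress survives a closed UI.
                run_file.write_text(json.dumps(state))
        state["status"] = "done"
        run_file.write_text(json.dumps(state))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return run_id, thread


def load_run(runs_dir, run_id):
    """Reload a run's saved state, e.g. when the UI is reopened."""
    return json.loads((Path(runs_dir) / f"{run_id}.json").read_text())


def ranking(run):
    """Rank models by mean score across tasks, best first."""
    scores = {}
    for row in run["results"]:
        scores.setdefault(row["model"], []).append(row["score"])
    means = {model: sum(s) / len(s) for model, s in scores.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)
```

A daemon thread stops when its host process exits, so this sketch only survives a closed browser tab, not a stopped server. Runs that must outlive the server would need a separate worker process or a job queue.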

Ranking

Select Tasks

Select Models

Start

Jobs always run in the background.

Runs

Available Tasks

Task Prompt

Click a task on the left to view its prompt.
