LLM Benchmark Platform

Run benchmarks in the background, close the UI, and come back to inspect rankings, task results, logs, and model outputs.

Ranking

Runs

Available Tasks

Task Prompt

Click a task on the left to view its prompt.
