LLM Benchmark Platform

Benchmarks run in the background, so you can close the UI and come back later to inspect rankings, task results, logs, and model outputs.

Ranking

Click a task on the left to view its prompt.