Best Benchmarks & Safety Tools (2026)

The top benchmarks & safety tools on Flux, ranked by Flux Score and community signal. 18 tools compared — vote to move them up the board.

Anthropic, Benchmarks & Safety

Anthropic

Safety-first frontier AI lab behind Claude

Frontier Labs 🔥
Physics, Data & Analytics

CERN Large Hadron Collider

The world's largest particle collider

Science
Benchmarks & Safety, Research

Artificial Analysis

New

Independent AI model benchmarking

Frontier Labs
Benchmarks & Safety, Research

Epoch AI

Research and benchmarks tracking AI progress

Frontier Labs
Challengers, Benchmarks & Safety

Safe Superintelligence (SSI)

Ilya Sutskever's lab building safe superintelligence.

Frontier Labs
Policy & Society, Benchmarks & Safety

AI Now Institute

New

Research on the social implications of AI

Tech & Culture
Policy & Society, Benchmarks & Safety

AI Safety Institute

New

Government body evaluating frontier AI risks

Tech & Culture
Benchmarks & Safety, Research

Apollo Research

New

AI deception and scheming evaluation lab

Frontier Labs
Policy & Society, Benchmarks & Safety

Center for AI Safety

New

Nonprofit reducing societal-scale AI risk

Tech & Culture
Policy & Society, Regulation

EU AI Act

New

The world's first comprehensive AI law

Tech & Culture
Benchmarks & Safety, Research

LMArena

New

Crowdsourced human-preference model leaderboard

Frontier Labs
Benchmarks & Safety, Research

METR

New

Independent frontier-model dangerous-capability evaluator

Frontier Labs
Benchmarks & Safety, Research

MLCommons

New

Open engineering consortium behind MLPerf and AI safety benchmarks

Frontier Labs
Policy & Society, Benchmarks & Safety

Partnership on AI

New

Multistakeholder AI governance nonprofit

Tech & Culture
Benchmarks & Safety, Research

Transluce

New

Open interpretability and AI-oversight nonprofit

Frontier Labs
Benchmarks & Safety, Research

ARC Prize

New

ARC-AGI benchmark designed to resist memorization and test true generalization.

Frontier Labs
Data & Analytics, Benchmarks & Safety

Scale AI

New

Data labeling and AI evaluation platform; runs the SEAL leaderboards.

AI Tools
Benchmarks & Safety, Coding

SWE-Bench

New

Benchmark evaluating LLMs on resolving real GitHub software issues.

Frontier Labs

The state of AI, in flux.

The directory + magazine for AI tools and the workflows people use to make money with them.

Some outbound links are affiliate links — Flux may earn a commission at no cost to you; this never affects rankings. Earnings figures are self-reported and not guarantees of income; most people earn less, some earn nothing.