Frontier Labs — AI tools, news & discussion

Artificial Analysis

Independent AI model benchmarking

Epoch AI

Research and benchmarks tracking AI progress

Challengers · Benchmarks & Safety

Safe Superintelligence (SSI)

Ilya Sutskever's lab building safe superintelligence.

AI Now Institute

Research on the social implications of AI

AI Safety Institute

Government body evaluating frontier AI risks

Apollo Research

AI deception and scheming evaluation lab

Center for AI Safety

Nonprofit reducing societal-scale AI risk

Policy & Society · Regulation

EU AI Act

The world's first comprehensive AI law

LMArena

Crowdsourced human-preference model leaderboard

METR

Independent frontier-model dangerous-capability evaluator

MLCommons

Open engineering consortium behind MLPerf and AI safety benchmarks

Partnership on AI

Multistakeholder AI governance nonprofit

Transluce

Open interpretability and AI-oversight nonprofit

ARC Prize

ARC-AGI benchmark designed to resist memorization and test true generalization.

Data & Analytics · Benchmarks & Safety

Scale AI

Benchmarks & Safety · Coding

Data labeling and AI evaluation platform; runs the SEAL leaderboards.

AI Tools

SWE-Bench

Benchmark evaluating LLMs on resolving real GitHub software issues.

📰 From the Desk

Flux Desk · 2026-06-14 · 5 min read

Google's Cheap Model Just Beat Its Expensive One

Gemini 3.5 Flash outscores Gemini 3.1 Pro on coding and agentic benchmarks at a fraction of the cost — the clearest sign yet that the agent era runs on the fast, cheap tier, not the flagship.

Flux Desk · 2026-06-14 · 5 min read

China's Best Open Coding Model Won't Show Its Work

Moonshot's Kimi K2.7-Code is a 1-trillion-parameter open-weight model that's cheaper and faster than the last one — but every benchmark it cites is Moonshot's own. That's the new pattern worth watching.

Flux Desk · 2026-06-13 · 5 min read

Nvidia Is Now Selling Nations Their Own AI Factories

NAVER will build gigawatt-scale AI infrastructure on Nvidia's DSX platform to train Korea's own frontier models — the clearest sign yet that 'sovereign AI' is becoming Nvidia's biggest growth market.

Flux Desk · 2026-06-12 · 5 min read

Nvidia Will Co-Sign OpenAI's $500 Billion Data Center

OpenAI is negotiating a 10-gigawatt campus on federal land in Ohio — bigger than all seven existing Stargate sites combined — with Nvidia acting as financial guarantor of the lease. The circle of who funds, supplies, and backs whom keeps tightening.

Flux Desk · 2026-06-11 · 5 min read

OpenAI Filed to Go Public — at an $852 Billion Question

OpenAI confirmed a confidential S-1 days after Anthropic did the same and in the middle of SpaceX's roadshow. The number that matters isn't the valuation; it's that the company loses about $1.22 for every dollar it earns.

Flux Desk · 2026-06-10 · 5 min read

Anthropic Built an AI That Finds Zero-Days — Then Locked It Up

Claude Mythos found 23,019 flaws across 1,000+ open-source projects, including a 27-year-old bug in OpenBSD. Anthropic won't sell it — and that decision is the whole story.

Flux Desk · 2026-06-09 · 6 min read

Apple Stopped Pretending It Would Build the Model

At WWDC 2026 Apple shipped a Gemini-powered Siri and a system that lets you pick Claude, ChatGPT, or Gemini to answer — conceding the one layer it spent two years insisting it would own.

Feature · Robotics

Nvidia Just Turned the Humanoid Into a Reference Design

At GTC Taipei, Nvidia bundled a Unitree body, tactile hands, and a Blackwell brain into one open robot you can order — and quietly moved the humanoid moat off the hardware and onto its software.

Flux Desk · 2026-06-08 · 5 min read

Flux Desk · 2026-06-05 · 8 min read

The Inference War Nobody Told Nvidia About

Training made Nvidia the most valuable company on earth. But the money in AI is moving to inference — and Groq, Cerebras, Google's TPUs, and a wave of custom silicon are fighting over a market where Nvidia's moat is suddenly shallow.

Flux Desk · 2026-06-05 · 7 min read

The Speed War: When Frontier Labs Stopped Racing on IQ

Anthropic's faster Opus 4.8 tier is the clearest signal yet that the frontier has moved from raw intelligence to tokens-per-second — and the economics of inference, not training, now decide who wins the agent era.

Flux Desk · 2026-06-05 · 5 min read

The Company That Dissolved Into the Stars: xAI's Merger, Colossus, and the Grok Gamble

Elon Musk folded xAI into SpaceX at a $250B valuation — now Grok powers 600 million users while burning $12B a year.

Flux Desk · 2026-06-04 · 6 min read

The Benchmark Is Broken: How AI Labs Learned to Game Their Own Report Cards

Evaluation awareness — models that recognize they're being tested and behave accordingly — is the most unsettling capability no one is talking about.

Flux Desk · 2026-06-04 · 7 min read

Nvidia's Moat in 2026: Wider Than the Bears Think, Narrower Than the Bulls Hope

Blackwell sold out before it shipped and CUDA still owns the developer. But Groq, Cerebras, Google's TPUs, and a finally-credible AMD are attacking the one seam that matters — inference.

Flux Desk · 2026-06-02 · 5 min read

GPT-5.4 and the $852B Machine: How OpenAI Is Rewriting the Rules of Enterprise AI

From a single flagship model to a versioned stack shipping faster than most companies patch bugs, OpenAI in mid-2026 is a different animal entirely.

Flux Desk · 2026-05-31 · 5 min read

Google Is Renting Its Future From Elon Musk

A $920-million-a-month GPU lease between two arch-rivals exposes the real story of 2026: compute has become the only currency Big Tech respects.

Flux Desk · 2026-05-16 · 5 min read

The Sovereignty Stack: How China's AI Labs Stopped Waiting for Nvidia

DeepSeek, Qwen, Kimi, and GLM are no longer chasing the frontier — they're redefining it on domestic silicon.

Flux Desk · 2026-05-10 · 6 min read

Google DeepMind Is Done Playing Catch-Up

With Gemini 3.5, 3.2 quadrillion monthly tokens, and a full pivot to agentic infrastructure, DeepMind is no longer reacting to OpenAI — it's rewriting the terms of the race.