Technical Performance - The AI Index

Last updated: June 2026. Part of the Report AI Library. Every figure links to a primary source.

In summary: Technical Performance tracks how good AI models actually are and what it takes to run them — frontier-model benchmarks, the shift to test-time compute and reasoning, AI infrastructure and data-center energy demand, and AI safety, governance, and incidents.

Reports in this silo

AI Model Benchmarks 2026

SWE-bench scores jumped from 4.4% to 71.7% in a year (Stanford HAI). MMLU has saturated above 92%. The US–China model performance gap stands at 2.7%.

Test-Time Compute & Reasoning Models

OpenAI's Deep Research scored 26.6% on Humanity’s Last Exam. Gemini Deep Think reached gold-medal level at IMO 2025 (35/42 points).

AI Infrastructure 2026

Data-center electricity: 415 TWh (2024) → ~945 TWh by 2030 (IEA). IDC: AI infra spend $487B in 2026, >$1T by 2029.
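
For a sense of scale, the two IEA endpoints above imply a compound annual growth rate. The snippet below is a minimal sketch of that arithmetic only: the helper name `implied_cagr` is illustrative, and the assumption of smooth year-on-year compounding between 2024 and 2030 is ours, not the IEA's.

```python
def implied_cagr(start: float, end: float, years: int) -> float:
    """Constant annual growth rate that takes `start` to `end` over `years`."""
    return (end / start) ** (1 / years) - 1


# Endpoints quoted above: 415 TWh in 2024, roughly 945 TWh by 2030 (IEA).
# Assumption: steady compounding across the six intervening years.
rate = implied_cagr(415.0, 945.0, 2030 - 2024)
print(f"Implied data-center electricity growth: {rate:.1%} per year")
```

Under that assumption the result is a little under 15% per year, which is a rough reading of the two endpoints rather than a forecast of any single year.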

AI Safety & Governance 2026

AI incidents, the EU AI Act, US regulation, and responsible-AI adoption — the policy and risk picture.

Other silos

AI Business & Economics — investment, government spending, market forecasts.
Enterprise AI Adoption — organizational adoption, use patterns, generative AI.
Workforce & Labor Impacts — jobs, skills premium, algorithmic management.