Last updated: June 2026. Part of the Report AI Library. Every figure links to a primary source.
In summary: Technical Performance tracks how good AI models actually are and what it takes to run them — frontier-model benchmarks, the shift to test-time compute and reasoning, AI infrastructure and data-center energy demand, and AI safety, governance, and incidents.
Reports in this silo
- SWE-bench jumped from 4.4% to 71.7% in a year (Stanford HAI). MMLU saturated above 92%. US–China gap at 2.7%.
- OpenAI's Deep Research hit 26.6% on Humanity’s Last Exam. Gemini Deep Think earned a gold-medal score (35/42) at IMO 2025.
- Data-center electricity: 415 TWh (2024) → ~945 TWh by 2030 (IEA). IDC: AI infrastructure spending of $487B in 2026, >$1T by 2029.
- AI incidents, the EU AI Act, US regulation, and responsible-AI adoption: the policy and risk picture.
Other silos
- AI Business & Economics — investment, government spending, market forecasts.
- Enterprise AI Adoption — organizational adoption, use patterns, generative AI.
- Workforce & Labor Impacts — jobs, skills premium, algorithmic management.