Embodied AI

Embodied AI is artificial intelligence that perceives and acts in the physical world through a body, rather than operating purely in software. Humanoid robots are the most visible example, but drones, autonomous vehicles, and robotic arms also qualify. It pairs world models and vision-language-action models with physical actuators.

How it works

Embodied systems combine perception (cameras and other sensors), a reasoning model, and motor control. The key development of 2025–26 is the vision-language-action (VLA) model: a single generalist model that maps what a robot sees and is told directly to motor actions. Examples include Figure’s Helix and Physical Intelligence’s open-sourced π0. NVIDIA’s Cosmos world-model platform is used to train such models in simulation.
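The perceive, reason, act cycle described above can be sketched as a small control loop. This is a toy illustration only: `Observation`, `toy_vla_policy`, `apply_action`, and `control_loop` are hypothetical names invented for this sketch, and the "policy" is a hand-written rule standing in for a learned VLA model. It does not reflect the API of Helix, π0, or Cosmos.

```python
from dataclasses import dataclass
from typing import Callable, List

Frame = List[List[float]]   # stand-in for a camera image (rows of pixel intensities)
Action = List[float]        # one position delta per joint


@dataclass
class Observation:
    """What the robot sees, feels, and is told at one time step."""
    image: Frame
    joint_positions: List[float]
    instruction: str


def toy_vla_policy(obs: Observation) -> Action:
    """Hypothetical stand-in for a VLA model: (vision, language, state) -> action.

    A real VLA is a learned neural network; this rule only mimics its interface.
    """
    pixel_count = sum(len(row) for row in obs.image)
    brightness = sum(sum(row) for row in obs.image) / max(1, pixel_count)
    direction = 1.0 if "raise" in obs.instruction.lower() else -1.0
    step = 0.1 * direction * brightness
    return [step for _ in obs.joint_positions]


def apply_action(joints: List[float], action: Action,
                 lo: float = -1.0, hi: float = 1.0) -> List[float]:
    """Motor control stand-in: add each delta and clamp to assumed joint limits."""
    return [min(hi, max(lo, j + a)) for j, a in zip(joints, action)]


def control_loop(policy: Callable[[Observation], Action], instruction: str,
                 joints: List[float], frames: List[Frame]) -> List[float]:
    """Run one perceive -> reason -> act cycle per incoming camera frame."""
    for frame in frames:
        obs = Observation(image=frame, joint_positions=joints,
                          instruction=instruction)
        action = policy(obs)                    # reasoning: observation -> action
        joints = apply_action(joints, action)   # actuation
    return joints


if __name__ == "__main__":
    bright_frame = [[1.0, 1.0], [1.0, 1.0]]
    final = control_loop(toy_vla_policy, "Raise the arm", [0.0, 0.0],
                         [bright_frame] * 3)
    print(final)
```

The point of the sketch is the shape of the interface: a single policy call takes the image, the robot's own state, and the language instruction together and returns motor commands, instead of passing through separate hand-built perception and planning stages.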

Why it matters

Embodied AI is where the most capital is flowing in the “after the agent” wave: Figure AI raised more than $1B at a $39B valuation (Sept 2025), Physical Intelligence raised $600M at a $5.6B valuation (Nov 2025), and 1X opened pre-orders for its $20,000 Neo home robot. Public timelines from CEOs such as Altman and Musk increasingly point to 2027 for robots operating in the real world. See After the Agent.

Related terms: World Model · AI Agent · Frontier Model