Gemini Robotics 1.5: AI Agents Enter the Physical World

Google DeepMind unveils new models for more intelligent, versatile robots.

Google DeepMind has introduced Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, two new AI models designed to empower robots with advanced perception, planning, and action capabilities. These models aim to enable robots to tackle complex, multi-step tasks in the real world more effectively and transparently.

By Mark Ellison

September 26, 2025

3 min read

Key Facts

  • Google DeepMind introduced Gemini Robotics 1.5 (VLA model) and Gemini Robotics-ER 1.5 (VLM model).
  • These models enable robots to perceive, plan, think, use tools, and act for complex tasks.
  • Gemini Robotics 1.5 translates visual information and instructions into motor commands.
  • Gemini Robotics-ER 1.5 reasons about the physical world and creates multi-step plans.
  • Gemini Robotics-ER 1.5 is now available to developers via the Gemini API.

Why You Care

Ever wondered when robots would move beyond simple, repetitive tasks and truly understand the world around them? What if your robot could not only sort your recycling but also explain why it chose each bin? Google DeepMind’s latest advancements with Gemini Robotics 1.5 are bringing us closer to that reality, making robots smarter and more capable in your daily life.

What Actually Happened

Google DeepMind has unveiled two significant new AI models: Gemini Robotics 1.5 and Gemini Robotics-ER 1.5. These models are designed to enable robots to perceive, plan, think, use tools, and act, as mentioned in the release. They aim to better solve complex, multi-step tasks. Gemini Robotics 1.5 is a vision-language-action (VLA) model, which translates visual information and instructions into motor commands for a robot, according to the announcement. Meanwhile, Gemini Robotics-ER 1.5 is a vision-language model (VLM) that reasons about the physical world and creates detailed, multi-step plans, the team revealed.

These models unlock “agentic experiences” with thinking capabilities for robots, as detailed in the blog post. This means robots can now process information and make decisions more autonomously. The goal is to build more capable and versatile robots that actively understand their environment, the company reports.

Why This Matters to You

These new models mean your future robots could handle much more than just vacuuming. They can perform tasks requiring contextual understanding and multiple steps. For example, imagine asking a robot to “sort these objects into the correct compost, recycling, and trash bins based on my location.” The robot would need to search for local guidelines, visually identify objects, and then execute the sorting. This is a complex chain of reasoning and action.
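To make that chain concrete, here is a rough sketch of how such a plan-then-act loop could be orchestrated. Everything in it is illustrative: the plan_with_er_model and execute_with_vla functions are hypothetical stand-ins for calls to a reasoning (ER) model and an action (VLA) controller, not APIs confirmed by the announcement.

```python
# Illustrative plan-then-act loop: an ER-style model produces a multi-step
# plan, and a VLA-style controller carries out each step on the robot.
# Both helper functions are hypothetical placeholders, not real APIs.

def plan_with_er_model(instruction: str, scene_image: bytes) -> list[str]:
    """Hypothetical call to a reasoning (ER) model: returns an ordered list
    of natural-language steps, e.g. 'put the soda can in the recycling bin'."""
    raise NotImplementedError("Replace with a real planning-model call.")

def execute_with_vla(step: str, scene_image: bytes) -> bool:
    """Hypothetical call to an action (VLA) model that turns one step
    into motor commands and reports whether it succeeded."""
    raise NotImplementedError("Replace with a real robot controller.")

def sort_waste(camera_frame: bytes) -> None:
    instruction = (
        "Sort the objects on the table into compost, recycling, and trash "
        "according to local guidelines, and explain each choice."
    )
    plan = plan_with_er_model(instruction, camera_frame)
    for step in plan:
        print(f"Executing: {step}")
        if not execute_with_vla(step, camera_frame):
            print(f"Step failed, stopping to replan: {step}")
            break
```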

How will these advancements change your interaction with these systems?

Key Capabilities of Gemini Robotics Models:
* Perceive: Understand visual and linguistic information.
* Plan: Create detailed, multi-step action sequences.
* Think: Reason about actions and explain decisions.
* Act: Translate plans into physical motor commands.
* Tool Use: Natively call digital tools for information.

As the company states, “Most daily tasks require contextual information and multiple steps to complete, making them notoriously challenging for robots today.” These new models directly address that challenge. Gemini Robotics-ER 1.5 is already available to developers via the Gemini API, meaning new applications could emerge sooner than you think.
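For developers who want a feel for what querying the new model might look like, here is a minimal sketch using the google-genai Python SDK. The model identifier and prompt are assumptions for illustration only; check Google's documentation for the actual Robotics-ER model name and usage terms.

```python
# Minimal sketch: asking a Gemini model for a multi-step robot plan.
# Assumptions: the google-genai SDK is installed (pip install google-genai),
# GEMINI_API_KEY is set in the environment, and the model identifier below
# is a placeholder rather than a confirmed value from the announcement.
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # hypothetical identifier
    contents=(
        "You are controlling a single-arm robot at a kitchen counter. "
        "Plan the steps to sort a banana peel, a soda can, and a plastic bag "
        "into compost, recycling, and trash bins, and explain each choice."
    ),
)

print(response.text)  # the model's step-by-step plan in natural language
```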

The Surprising Finding

One particularly interesting aspect of these new models is their ability to explain their decision-making process. Gemini Robotics 1.5 can “even explain its thinking processes in natural language — making its decisions more transparent,” as mentioned in the release. This is a significant shift from traditional robotics, where a robot’s actions often feel like a black box. This transparency allows for greater trust and easier debugging. It challenges the assumption that AI must always operate without human-understandable reasoning. The model thinks before taking action and shows its process, helping robots assess and complete complex tasks more transparently, the announcement states.

What Happens Next

With Gemini Robotics-ER 1.5 now available to developers via the Gemini API, we can expect new robotic applications to emerge in the coming months as developers integrate these planning and reasoning capabilities into their robot designs. For example, future robots in warehouses could autonomously reconfigure layouts based on supply chain changes. In homes, robots might adapt to your specific tidying preferences by understanding your natural language instructions. The implications span industries from logistics to elder care. We are moving toward a future where robots are not just tools but intelligent agents capable of complex interactions.
