Gemini AI Enters the Physical World with New Robotics Models

Google DeepMind unveils Gemini Robotics and Gemini Robotics-ER, bringing advanced AI to physical robots.

Google DeepMind has launched two new AI models, Gemini Robotics and Gemini Robotics-ER, based on Gemini 2.0. These models are designed to enable robots to perform complex real-world tasks through embodied reasoning and direct physical control, moving AI beyond the digital realm.

By Mark Ellison

December 5, 2025

Key Facts

  • Google DeepMind introduced Gemini Robotics and Gemini Robotics-ER, two new AI models based on Gemini 2.0.
  • Gemini Robotics is a vision-language-action (VLA) model that directly controls robots through physical actions.
  • Gemini Robotics-ER features advanced spatial understanding for embodied reasoning.
  • These models enable robots to perform a wider range of real-world tasks.
  • Google DeepMind is partnering with Apptronik to build humanoid robots using Gemini 2.0.

Why You Care

Ever wonder when robots will actually do things in the real world, not just in labs or factories? What if your future robot assistant could understand your messy kitchen and tidy it up?

Google DeepMind recently announced a significant step toward making this a reality with Gemini Robotics, a new family of AI models. These models aim to give robots human-like comprehension and action, which could make your interactions with robots far more natural and helpful.

What Actually Happened

Google DeepMind has unveiled two new AI models, both built upon the Gemini 2.0 architecture, according to the announcement. These models are designed to extend artificial intelligence capabilities into the physical world. The first, named Gemini Robotics, is a vision-language-action (VLA) model. It uses physical actions as a new output modality, allowing direct control over robots.

The second model is Gemini Robotics-ER. This version incorporates spatial understanding. It enables roboticists to develop programs leveraging Gemini’s embodied reasoning (ER) abilities. Embodied reasoning refers to a robot’s capacity to understand and react to its physical surroundings. The company reports that both models significantly expand the range of real-world tasks robots can perform. Google DeepMind is also partnering with Apptronik to integrate Gemini 2.0 into humanoid robots.

Why This Matters to You

These new Gemini models are not just for researchers; they have practical implications for everyday life. Imagine a robot that can truly understand its environment. A warehouse robot, for example, could pick up items it has never seen before and place them correctly, improving efficiency and reducing errors.

Google DeepMind emphasizes three core qualities for useful AI robotics:

  • Generality: Robots can adapt to new, unseen situations and objects.
  • Interactivity: Robots can understand and respond quickly to instructions.
  • Dexterity: Robots can perform precise tasks with their “hands” and “fingers.”

How might these advancements change your daily routines or work life in the coming years? As mentioned in the release, these models represent “a substantial step in performance on all three axes, getting us closer to truly general purpose robots.” This points toward robots that are less specialized and more adaptable. Your future smart home devices could gain much greater physical autonomy.

The Surprising Finding

One particularly interesting aspect of Gemini Robotics is its ability to generalize to novel situations. The technical report explains that the model can solve a wide variety of tasks “out of the box.” This includes tasks it has never encountered during its training phase. This challenges the common assumption that AI must be explicitly trained on every possible scenario to perform effectively. Instead, Gemini Robotics leverages Gemini’s deep world understanding. This allows it to adapt to new objects, diverse instructions, and unfamiliar environments with surprising ease. This means less pre-programming and more real-time adaptability for robots.

What Happens Next

Google DeepMind plans to continue exploring and developing these models’ capabilities. They aim to move them further along the path to real-world applications. We can anticipate seeing more detailed demonstrations and pilot programs within the next 12-18 months. For example, expect to see humanoid robots, like those from Apptronik, showcasing enhanced dexterity and understanding in controlled environments. This could include tasks such as assembling complex products or assisting in healthcare settings.

For readers, it’s wise to keep an eye on developments in robotic interaction. Consider how these more general-purpose robots might integrate into your industry. The team revealed they are working with a select group of trusted testers, whose feedback will guide the future development of Gemini Robotics-ER. This iterative process suggests a careful, measured rollout of these new capabilities across various sectors.
