Google DeepMind’s new Gemini robots adapt to complex tasks like laundry sorting and recycling

Google DeepMind has unveiled a new generation of artificial intelligence models designed to boost reasoning capabilities in robotics, moving robots closer to handling everyday tasks with human-like problem-solving.
The latest systems, called Gemini Robotics 1.5 and Gemini Robotics-ER 1.5 (the “ER” stands for embodied reasoning), are designed to help robots complete multi-step activities by “thinking” before they act. These vision-language-action models combine visual, textual, and contextual inputs and translate them into physical action. According to DeepMind, this allows robots to go beyond following single commands and tackle more complex, real-world scenarios such as sorting laundry, packing bags, and recycling rubbish.
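DeepMind has not published a programming interface for these models, but the “think before acting” pattern itself is simple to sketch. The Python below is a minimal, hypothetical illustration: every name in it (Observation, plan_steps, execute) is invented for this sketch, and a fixed plan stands in for whatever the model would actually generate.

```python
# Hypothetical sketch of a "think, then act" control loop for a
# vision-language-action system. None of these names are real Gemini
# Robotics APIs; they exist only to illustrate the described behaviour.

from dataclasses import dataclass

@dataclass
class Observation:
    camera_image: bytes   # current camera frame (placeholder)
    instruction: str      # natural-language task, e.g. "sort the laundry"

def plan_steps(obs: Observation) -> list[str]:
    """Stand-in for the reasoning step: break the instruction into
    ordered sub-tasks before any motion is commanded."""
    # A real system would query the model here; a fixed plan stands in.
    return [
        "locate the laundry pile",
        "pick up the next garment",
        "judge its colour",
        "place it in the matching basket",
    ]

def execute(step: str, obs: Observation) -> None:
    """Stand-in for the low-level action model that turns a sub-task
    into motor commands."""
    print(f"executing: {step}")

def run_task(obs: Observation) -> None:
    # The "thinking" happens up front: a plan is produced before acting,
    # instead of mapping the instruction directly to a single motion.
    for step in plan_steps(obs):
        execute(step, obs)

run_task(Observation(camera_image=b"", instruction="sort the laundry by colour"))
```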
Moving Beyond Single Instructions
“Models up to now were able to do really well at doing one instruction at a time,” said Carolina Parada, senior director and head of robotics at Google DeepMind. “We’re now moving from one instruction to actually genuine understanding and problem solving for physical tasks.”
A demonstration showed a robot sorting laundry into baskets by colour, while another packed a beanie into a bag for a trip. When asked to prepare for London weather, the robot checked forecasts online and added an umbrella. In another case, it identified San Francisco’s local recycling rules via a web search before sorting rubbish into the correct bins.
From Gemini 2.0 to Real-World Tasks
The new models build on earlier versions released in March, which were powered by DeepMind’s Gemini 2.0 system. Those earlier models enabled robots to adjust to changing environments, respond to voice instructions, and perform fine motor tasks such as folding paper or unzipping bags.
The latest release goes further by integrating online resources—like search engines—to extend robots’ knowledge and reasoning. This ability to combine planning with real-world information, experts suggest, could mark a major step toward more general-purpose robots.
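The recycling demonstration hints at what this looks like in practice: fetch the local rules first, then plan around them. The sketch below illustrates the pattern; the web_search and parse_rules functions are invented stand-ins for this example, not DeepMind’s actual pipeline.

```python
# Hypothetical sketch: fetch live information (local recycling rules)
# before planning, as in the San Francisco demonstration. `web_search`
# and `parse_rules` are invented stand-ins, not real Gemini APIs.

def web_search(query: str) -> str:
    """Stand-in for a search-engine tool call."""
    return "recycling=bottle,can;compost=banana peel;landfill=wrapper"

def parse_rules(rules: str) -> dict[str, str]:
    """Stand-in for the model interpreting the fetched text."""
    mapping = {}
    for entry in rules.split(";"):
        bin_name, items = entry.split("=")
        for item in items.split(","):
            mapping[item] = bin_name
    return mapping

def plan_recycling(city: str, items: list[str]) -> list[str]:
    # Look up the rules for the robot's current location first, so the
    # plan reflects local policy rather than fixed training data.
    bin_for = parse_rules(web_search(f"{city} recycling rules"))
    return [f"put the {item} in the {bin_for.get(item, 'landfill')} bin"
            for item in items]

print(plan_recycling("San Francisco", ["bottle", "banana peel", "wrapper"]))
```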
Toward a “ChatGPT Moment” for Robotics
Ingmar Posner, professor of applied AI at the University of Oxford, said training robotics models on internet-scale data could spark a “ChatGPT moment” for the field. Not all experts, however, see such models as genuinely reasoning. Angelo Cangelosi, co-director of the Manchester Centre for Robotics and AI, cautioned: “It’s just discovering regularities between pixels, between images, between words, tokens, and so on. It’s not real thinking.”
Tackling the Training Bottleneck
A major bottleneck in robotics has been the lack of training data. Unlike language models that can be trained on vast amounts of text from the internet, robots traditionally require painstaking, robot-specific training. DeepMind’s new approach introduces a technique called motion transfer, which allows skills designed for one type of robot—such as a robotic arm—to be applied to another, like a humanoid robot.
“Unlike large language models that can be trained on the entire vast internet of data, robotics has been limited by the painstaking process of collecting real [data for robots],” said Kanishka Rao, principal software engineer at Google DeepMind.
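DeepMind has not described the mechanics of motion transfer beyond the name, but the underlying idea, reusing a skill captured on one robot body on a differently shaped one, can be illustrated with a toy example. Everything below (the Waypoint class, the task-space representation, the offset values) is an assumption made for the sketch, not the actual technique.

```python
# Toy illustration of reusing a skill across robot bodies. The skill is
# stored as end-effector waypoints in a shared task space rather than
# joint angles, so it is not tied to one robot's mechanics. All names
# and numbers are invented; this is not DeepMind's actual technique.

from dataclasses import dataclass

@dataclass
class Waypoint:
    x: float
    y: float
    z: float
    gripper_open: bool

# A pick-and-place skill recorded on robot A (e.g. a tabletop arm),
# expressed in robot A's base frame.
pick_and_place = [
    Waypoint(0.4, 0.0, 0.3, True),   # approach above the object
    Waypoint(0.4, 0.0, 0.1, True),   # descend
    Waypoint(0.4, 0.0, 0.1, False),  # grasp
    Waypoint(0.2, 0.3, 0.3, False),  # carry
    Waypoint(0.2, 0.3, 0.1, True),   # release
]

def retarget(skill: list[Waypoint],
             a_base_in_b: tuple[float, float, float]) -> list[Waypoint]:
    """Re-express the waypoints in robot B's base frame, assuming the
    two bases differ only by a translation (no rotation)."""
    dx, dy, dz = a_base_in_b  # position of A's base in B's frame
    return [Waypoint(w.x + dx, w.y + dy, w.z + dz, w.gripper_open)
            for w in skill]

# Reuse the arm's skill on a humanoid whose base sits elsewhere.
for w in retarget(pick_and_place, a_base_in_b=(-0.1, 0.0, 0.5)):
    print(w)
```

A real system would also have to bridge differences in kinematics, grippers, and sensing between the two robots, which is what makes the problem hard in practice.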
Remaining Challenges
Despite the advances, DeepMind acknowledges hurdles ahead. Robots must become more dexterous, reliable, and safe before they can be deployed widely in human environments. One area of focus is enabling robots to learn by watching humans perform tasks, a way of acquiring skills that is intuitive for people but remains difficult for machines.
“One of the major challenges of building general robots is that things that are intuitive for humans are actually quite difficult for robots,” Rao said.
The Bigger Picture
The unveiling of Gemini Robotics 1.5 comes amid a race among major tech groups—including OpenAI and Tesla—to integrate advanced AI into robotics. These efforts aim to create versatile machines that could transform industries ranging from healthcare to manufacturing.
With the latest advances, DeepMind has brought robotics closer to that vision: robots capable not only of executing commands but also of reasoning through multi-step problems—blending planning, perception, and real-world knowledge into action.