Google DeepMind shared on Thursday a research preview of SIMA 2, the next generation of its generalist AI agent that integrates the language and reasoning powers of Gemini, Google’s large language model, to move beyond simply following instructions to understanding and interacting with its environment.
Like many of DeepMind’s projects, including AlphaFold, the first version of SIMA was trained on hundreds of hours of video game data to learn how to play multiple 3D games like a human, even some games it wasn’t trained on. SIMA 1, unveiled in March 2024, could follow basic instructions across a wide range of virtual environments, but it only had a 31% success rate for completing complex tasks, compared to 71% for humans.
“SIMA 2 is a step change and improvement in capabilities over SIMA 1,” Joe Marino, senior research scientist at DeepMind, said in a press briefing. “It’s a more general agent. It can complete complex tasks in previously unseen environments. And it’s a self-improving agent. So it can actually self-improve based on its own experience, which is a step towards more general-purpose robots and AGI systems more generally.”

SIMA 2 is powered by the Gemini 2.5 Flash-Lite model. AGI refers to artificial general intelligence, which DeepMind defines as a system capable of a wide range of intellectual tasks, with the ability to learn new skills and generalize knowledge across different areas.
Working with so-called “embodied agents” is crucial to generalized intelligence, DeepMind’s researchers say. Marino explained that an embodied agent interacts with a physical or virtual world via a body – observing inputs and taking actions much like a robot or human would – whereas a non-embodied agent might interact with your calendar, take notes, or execute code.
Jane Wang, a research scientist at DeepMind with a background in neuroscience, told TechCrunch that SIMA 2 goes far beyond gameplay.
“We’re asking it to actually understand what’s happening, understand what the user is asking it to do, and then be able to respond in a common-sense way that’s actually quite difficult,” Wang said.
By integrating Gemini, SIMA 2 doubled its predecessor’s performance, uniting Gemini’s advanced language and reasoning abilities with the embodied skills developed through training.

Marino demoed SIMA 2 in No Man’s Sky, where the agent described its surroundings – a rocky planet surface – and determined its next steps by recognizing and interacting with a distress beacon. SIMA 2 also uses Gemini to reason internally. In another game, when asked to walk to the house that’s the color of a ripe tomato, the agent showed its thinking – ripe tomatoes are red, therefore I should go to the red house – then found and approached it.
Being Gemini-powered also means SIMA 2 follows instructions based on emojis: “You instruct it 🪓🌲, and it’ll go chop down a tree,” Marino said.
Marino also demonstrated how SIMA 2 can navigate newly generated photorealistic worlds produced by Genie, DeepMind’s world model, correctly identifying and interacting with objects like benches, trees, and butterflies.

Gemini also enables self-improvement without much human data, Marino added. Where SIMA 1 was trained entirely on human gameplay, SIMA 2 uses it as a baseline to provide a strong initial model. When the team puts the agent into a new environment, it asks another Gemini model to create new tasks and a separate reward model to score the agent’s attempts. Using these self-generated experiences as training data, the agent learns from its own mistakes and gradually performs better, essentially teaching itself new behaviors through trial and error as a human would, guided by AI-based feedback instead of humans.
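The loop Marino describes can be pictured with a toy sketch. Nothing here reflects DeepMind's actual code: `ToyAgent`, `propose_tasks`, `score_attempt`, and `self_improve` are hypothetical stand-ins for the agent, the task-setting Gemini model, the reward model, and the training cycle, and the "learning" is a simple numeric update rather than model training.

```python
import random
from dataclasses import dataclass, field

# Hypothetical task list; the article says a Gemini model invents tasks.
TASK_TEMPLATES = ["chop down a tree", "walk to the red house",
                  "find the beacon", "sit on the bench"]


def propose_tasks(environment, n, rng):
    """Stand-in for the task-setting model: pick n tasks for this environment."""
    return [f"{environment}: {rng.choice(TASK_TEMPLATES)}" for _ in range(n)]


def score_attempt(succeeded):
    """Stand-in for the separate reward model: 1.0 for success, 0.0 otherwise."""
    return 1.0 if succeeded else 0.0


@dataclass
class ToyAgent:
    """Hypothetical agent that tracks one skill level per task, in [0, 1)."""
    skill: dict = field(default_factory=dict)

    def attempt(self, task, rng):
        # Success becomes more likely as skill on this task grows.
        return rng.random() < self.skill.get(task, 0.1)

    def train(self, experiences, lr=0.2):
        # Every scored attempt moves skill toward 1.0; failures move it
        # by half as much, a crude proxy for learning from mistakes.
        for task, reward in experiences:
            current = self.skill.get(task, 0.1)
            step = lr if reward > 0 else lr / 2
            self.skill[task] = current + step * (1.0 - current)


def self_improve(agent, environment, rounds, rng):
    """Run the generate-attempt-score-train cycle; return mean reward per round."""
    history = []
    for _ in range(rounds):
        tasks = propose_tasks(environment, 4, rng)
        experiences = [(t, score_attempt(agent.attempt(t, rng))) for t in tasks]
        agent.train(experiences)  # self-generated experience as training data
        history.append(sum(r for _, r in experiences) / len(experiences))
    return history
```

Calling `self_improve(ToyAgent(), "new_world", rounds=20, rng=random.Random(0))` returns the average reward for each round. In this toy, skill on every attempted task can only rise, which mirrors the gradual improvement described, though not how it is achieved in SIMA 2.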
DeepMind sees SIMA 2 as a step toward unlocking more general-purpose robots.
“If we think of what a system needs to do to perform tasks in the real world, like a robot, I think there are two components of it,” Frederic Besse, senior staff research engineer at DeepMind, said during a press briefing. “First, there is a high-level understanding of the real world and what needs to be done, as well as some reasoning.”
If you ask a humanoid robot in your house to go check how many cans of beans you have in the cupboard, the system needs to understand all of the different concepts – what beans are, what a cupboard is – and navigate to that location. Besse says SIMA 2 touches more on that high-level behavior than it does on lower-level actions, which he refers to as controlling things like physical joints and wheels.
The team declined to share a specific timeline for implementing SIMA 2 in physical robotics systems. Besse told TechCrunch that DeepMind’s recently unveiled robotics foundation models – which can also reason about the physical world and create multi-step plans to complete a mission – were trained differently and separately from SIMA.
While there’s also no timeline for releasing more than a preview of SIMA 2, Wang told TechCrunch the goal is to show the world what DeepMind has been working on and see what kinds of collaborations and potential uses are possible.