Google DeepMind reveals Genie 3 for creating interactive 3D worlds

Credit: Genie 3/Google DeepMind

Google DeepMind has announced the next major step in its AI-generated world modeling efforts with the unveiling of Genie 3, a new system that creates interactive 3D environments in real time. Designed for use by AI agents and human users alike, Genie 3 offers longer, more coherent interactions and memory-like persistence, bringing us one step closer to truly immersive AI-generated virtual spaces.

What Are World Models?

World models are AI systems that simulate interactive environments. They are often used to train AI agents, power educational simulations, or fuel new forms of entertainment. Rather than being manually designed by developers, these worlds are generated on the fly from user prompts, allowing for rapid prototyping and highly adaptable experiences.

Google’s work in this space is moving quickly. Genie 2, unveiled in late 2024, could generate simple interactive environments from images. Now, Genie 3 significantly expands the scope and realism of these experiences.

What’s New in Genie 3

Compared to its predecessor, Genie 3 offers major advancements:

  • Longer Interaction Time: Users can now explore AI-generated worlds for several minutes, up from Genie 2’s 10–20 seconds.
  • World Memory: The model can now remember what’s in the environment, retaining object positions and textures for about a minute, even when the user looks away and returns.
  • Higher Quality Visuals: Worlds now run at 720p resolution and 24 frames per second, offering smoother and more realistic motion.
  • Promptable World Events: Users can change weather, add characters, or trigger in-world events simply by typing a prompt.

These improvements make Genie 3 feel less like a brief tech demo and more like an early prototype of dynamic AI-driven game engines.
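
To put those figures in perspective, the short calculation below works out what they imply per frame. It is a back-of-the-envelope sketch, not a description of how Genie 3 works internally: the 24 fps rate and the roughly one-minute memory window come from the list above, while the 1280x720 frame size (the usual meaning of "720p") and the three-minute session length (standing in for "several minutes") are assumptions made for illustration.

```python
# Back-of-the-envelope arithmetic for the figures quoted above.
FPS = 24                    # frames per second, as quoted
MEMORY_SECONDS = 60         # "about a minute" of world memory, as quoted
WIDTH, HEIGHT = 1280, 720   # assumption: "720p" taken as 1280x720 pixels
SESSION_SECONDS = 3 * 60    # assumption: "several minutes" taken as 3 minutes

frame_budget_ms = 1000 / FPS              # time available to produce each frame
pixels_per_frame = WIDTH * HEIGHT
pixels_per_second = pixels_per_frame * FPS
memory_frames = MEMORY_SECONDS * FPS      # frames spanned by the memory window
session_frames = SESSION_SECONDS * FPS    # frames in the assumed session

print(f"Time budget per frame: {frame_budget_ms:.1f} ms")
print(f"Pixels per frame:      {pixels_per_frame:,}")
print(f"Pixels per second:     {pixels_per_second:,}")
print(f"Memory window:         {memory_frames:,} frames")
print(f"3-minute session:      {session_frames:,} frames")
```

Under these assumptions, each frame has to be produced in under 42 milliseconds, and an object that stays consistent for a minute has to survive across 1,440 generated frames.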

How It Works

To generate a world, a user provides a text prompt describing the environment they want, such as “a jungle during a thunderstorm” or “a classroom with a chalkboard and rain outside.” Genie 3 then creates a playable 3D space in real time.

The model can recognize and retain spatial features (e.g., writing on a chalkboard remains legible if you walk away and return), though Google notes that readable text is still best generated by explicitly including it in the input prompt.

Still a Work in Progress

Despite its advances, Genie 3 is not yet ready for public release. Google DeepMind is rolling it out as a limited research preview, accessible only to a small group of academics and creators. This approach allows the company to better assess potential risks and refine safeguards before wider deployment.

Limitations include:

  • Restricted user interaction: There are boundaries on how much users can manipulate environments.
  • Text rendering inconsistencies: Reliable generation of written text remains a challenge.
  • No public access yet: General users can’t try the model, though Google is “exploring” broader testing in the future.

Why It Matters

Genie 3 represents a significant leap in real-time AI simulation, moving closer to AI systems that can generate persistent, explorable virtual worlds. That could prove to be a breakthrough for gaming, robotics training, and virtual education.

This also places Google DeepMind in direct competition with other world modeling efforts, including OpenAI’s Sora video generation tool and new interactive environments from startups backed by Pixar cofounder Ed Catmull.

The Future of Interactive AI Worlds

With Genie 3, Google DeepMind is staking its claim in the emerging frontier of AI-generated virtual spaces. The promise is compelling: dynamic, user-generated 3D worlds that can evolve, remember, and respond to human and machine interactions.

As access broadens and the technology matures, Genie 3 could redefine not only how we experience virtual environments but also how AI learns to navigate and understand the real world.
