Google AI Masters Minecraft in 9 Days: DreamerV3 Mines Diamonds Without Hints
Scientists from Google DeepMind and the University of Toronto have unveiled DreamerV3, an algorithm that learned to mine diamonds in Minecraft in just nine days, without human demonstrations or hints. The researchers present this as a significant step toward general-purpose artificial intelligence: one algorithm, run with a single fixed configuration, that can tackle a wide range of tasks, from robot control to data analysis.
DreamerV3 is built around an “internal simulator,” or world model: a neural network that predicts the consequences of actions, much as a chess player thinks several moves ahead. Within these imagined scenarios, one network (the “actor”) chooses actions and another (the “critic”) estimates how valuable the resulting situations are. To find diamonds in Minecraft, for example, the system learns step by step to chop wood, craft a pickaxe, and explore caves, all without human demonstrations or a hand-designed curriculum, guided only by the rewards it receives for progress toward the goal.
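The interplay of simulator, actor, and critic can be illustrated with a toy sketch. This is not DreamerV3's implementation: the three functions below are hand-written, hypothetical stand-ins for what would be learned neural networks, and the 1-D "walk to position 5" task is invented purely for illustration.

```python
import random

# Toy stand-ins for the three learned components (all hypothetical).
def world_model(state, action):
    """Predict the next state and reward for a 1-D walk toward a goal at position 5."""
    next_state = state + action
    reward = 1.0 if next_state == 5 else 0.0
    return next_state, reward

def actor(state, rng):
    """Pick a step of -1 or +1; a real actor would be a learned policy network."""
    return rng.choice([-1, 1])

def critic(state):
    """Estimate future value; here a hand-written guess that is higher near the goal."""
    return 1.0 / (1 + abs(5 - state))

def imagine(start_state, horizon, rng, discount=0.99):
    """Roll out an imagined trajectory and return its discounted value estimate."""
    state, total, scale = start_state, 0.0, 1.0
    for _ in range(horizon):
        action = actor(state, rng)                  # actor proposes an action
        state, reward = world_model(state, action)  # simulator predicts the outcome
        total += scale * reward
        scale *= discount
    # Bootstrap with the critic's estimate of what lies beyond the horizon.
    return total + scale * critic(state)

rng = random.Random(0)
value = imagine(start_state=0, horizon=10, rng=rng)
```

No real environment is touched inside `imagine`: every step happens inside the model's prediction, which is the sense in which the agent "thinks ahead" before acting.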
DreamerV3’s key advantage is its robustness. Unlike traditional algorithms such as PPO, which typically need retuning for each new task, it keeps the same hyperparameters across more than 150 test tasks, ranging from arcade games to robot control. This is achieved through normalization: the system automatically rescales rewards and value estimates so that signals of very different magnitude do not destabilize learning. In environments where rewards are scarce (like diamonds in Minecraft), this scaling lets the agent keep exploring with the same settings instead of requiring task-specific tuning.
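The rescaling idea can be sketched in a few lines. The example below assumes a symmetric logarithm ("symlog") transform and a percentile-based return scale along the lines described in the DreamerV3 paper; the function names and the nearest-rank percentile helper are illustrative choices, and the running average that the published method applies to the scale is omitted for brevity.

```python
import math

def symlog(x):
    """Compress large magnitudes symmetrically: sign(x) * ln(|x| + 1)."""
    return math.copysign(math.log1p(abs(x)), x)

def symexp(x):
    """Inverse of symlog: sign(x) * (exp(|x|) - 1)."""
    return math.copysign(math.expm1(abs(x)), x)

def percentile(values, q):
    """Nearest-rank percentile of a non-empty list (illustrative helper)."""
    ordered = sorted(values)
    index = round(q / 100 * (len(ordered) - 1))
    return ordered[index]

def normalize_returns(returns):
    """Scale returns down by their 5th-to-95th percentile range, never up.

    The max(1.0, ...) floor leaves small, sparse returns untouched, so
    noise is not amplified when rewards are rare.
    """
    spread = percentile(returns, 95) - percentile(returns, 5)
    scale = max(1.0, spread)
    return [r / scale for r in returns]

sparse = normalize_returns([0.0, 0.0, 0.1])          # unchanged
dense = normalize_returns(list(range(0, 1001, 10)))  # scaled down by 900
```

Large returns are shrunk to a comparable range, while sparse ones pass through unchanged, which is one way a single set of hyperparameters can serve both kinds of environment.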
The breakthrough came with its Minecraft performance: DreamerV3 is the first algorithm to collect diamonds “from scratch,” without human data, progressing through 12 milestones, from chopping trees to finding rare minerals. This demonstrates a capacity for long-horizon planning in unpredictable conditions, which is crucial for real-world robots operating in dynamic environments.
Scientists envision the technology’s future in combining learning with internet videos; a robotic assistant, for example, could acquire skills by observing humans. DreamerV3 also scales well: larger models reach higher performance and need less data to get there. This paves the way for systems that not only perform tasks but also adapt to new challenges, from smart factories to autonomous vehicles capable of “thinking” ahead.