In a stunning achievement, Google’s latest AI model, Gemini 2.5 Pro, has accomplished something extraordinary: beating a 29-year-old video game. The game in question? Pokémon Blue, a classic title that has captivated generations of players. Google CEO Sundar Pichai proudly announced the milestone on X (formerly Twitter) with the message, “What a finish! Gemini 2.5 Pro just completed Pokémon Blue!”
While the achievement is significant, the livestream that showcased it was created by Joel Z, a 30-year-old software engineer who is unaffiliated with Google. Although Google was not directly involved in the project, its executives have shown great support for the effort.
The Road to Victory: How Gemini 2.5 Pro Managed to Beat Pokémon Blue
Gemini’s journey to beating Pokémon Blue didn’t come without help. While the AI model showed great progress, it relied on a combination of agent tools and human guidance. The livestream was broadcast on Twitch under the banner “Gemini Plays Pokémon,” and the game’s completion was far from a solo effort by Gemini.
Logan Kilpatrick, the product lead for Google AI Studio, had hinted at Gemini’s growing success in a previous post, saying the AI had “earned its 5th badge” and was outperforming other models, including Claude, which was making progress in Pokémon Red. Kilpatrick’s tweet even jokingly referred to the AI’s progress as a “work-in-progress Artificial Pokémon Intelligence.”
The goal of these AI models is to complete the Pokémon games, but with a twist: they need to navigate a complex world of decision-making, problem-solving, and strategy, not by following pre-programmed instructions but by adapting in real time to the situation at hand.
Why Pokémon? The Inspiration Behind Gemini’s Challenge
So, why Pokémon Blue as the challenge for Gemini? The answer may lie in the growing interest in having AI systems play classic video games. In February, Anthropic’s Claude also made headlines for its progress in Pokémon Red. In a post, Anthropic noted that the AI’s “extended thinking and agent training” gave it a boost in tackling unexpected tasks, like playing the iconic Game Boy game.
Both Pokémon Blue and Pokémon Red belong to the original generation of Pokémon games, first released in 1996. The two versions are beloved by fans and are nearly identical, differing mainly in which Pokémon can be caught. They gave both Claude and Gemini a challenge that required more than simple calculation: it required adaptability, patience, and strategic thinking.
Joel Z, the creator behind the livestream, has also emphasized that comparing Gemini to other AI models, such as Claude, isn’t a straightforward exercise. “Please don’t consider this a benchmark for how well an LLM can play Pokémon,” Z cautioned on his Twitch stream. He explained that while both AI models are remarkable, they each have their own set of tools and inputs, which makes direct comparisons challenging.
AI Help: The “Agent Harness” That Made Gemini’s Victory Possible
Both Gemini and Claude rely on external tools to help them succeed in the game. The AI models don’t simply watch the screen and react; they receive additional information, such as screenshots overlaid with data, to help them make decisions in the game. This supporting setup is known as an “agent harness,” and it plays a crucial role in guiding the AI through complex game scenarios.
For example, Gemini used an agent harness to access important in-game information and decide on the best course of action. This system helps Gemini respond effectively to the challenges of Pokémon Blue, such as battling other trainers and collecting in-game items, without relying on pre-programmed solutions. The developer does, however, step in from time to time.
Despite these interventions, Joel Z is adamant that this is not cheating. In fact, he describes his role as improving Gemini’s reasoning and decision-making abilities. “My interventions improve Gemini’s overall decision-making and reasoning abilities,” Z explains. “I don’t give specific hints—there are no walkthroughs or direct instructions for particular challenges like Mt. Moon.”
Instead, Z’s help is subtle. The closest he comes to a hint, by his account, is telling Gemini that it must talk to a Rocket Grunt twice to obtain the Lift Key, a quirk of the original games that was fixed in Pokémon Yellow. This kind of assistance is designed to sharpen the AI’s skills rather than to spoon-feed it the answers.
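To make the harness idea in this section more concrete, here is a minimal Python sketch of a single step: the harness packages a description of the screen together with overlay data, asks a model for one button press, and validates the reply before passing it to the game. This is not the code behind Gemini Plays Pokémon. Every name in it (`Observation`, `build_prompt`, `parse_action`, `run_step`), the button list, and the fallback to "b" are assumptions made purely for illustration, and the model and emulator are stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical button set; a real harness may expose different inputs.
VALID_BUTTONS = {"up", "down", "left", "right", "a", "b", "start", "select"}


@dataclass
class Observation:
    screen_description: str  # stand-in for the actual screenshot image
    overlay: dict            # extra data the harness attaches, e.g. map name


def build_prompt(obs: Observation) -> str:
    """Combine the screen stand-in and overlay data into one text prompt."""
    overlay_lines = [f"{key}: {value}" for key, value in sorted(obs.overlay.items())]
    return "\n".join(
        [f"screen: {obs.screen_description}", *overlay_lines, "Reply with one button."]
    )


def parse_action(reply: str) -> str:
    """Keep the reply only if it names a valid button; otherwise fall back to 'b'."""
    button = reply.strip().lower()
    return button if button in VALID_BUTTONS else "b"  # assumed harmless default


def run_step(
    obs: Observation,
    model: Callable[[str], str],
    press: Callable[[str], None],
) -> str:
    """One harness step: prompt the model, validate its answer, press the button."""
    action = parse_action(model(build_prompt(obs)))
    press(action)
    return action


if __name__ == "__main__":
    pressed = []
    obs = Observation("player faces a door", {"map": "Pallet Town", "x": 3})
    # The lambda below plays the role of the model; a real one would be an API call.
    run_step(obs, model=lambda prompt: " A\n", press=pressed.append)
    print(pressed)
```

The point of the sketch is the division of labor: the model only ever sees what the harness chooses to show it and can only act through the inputs the harness accepts, which is one reason two projects with different harnesses are hard to compare directly.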
Ongoing Development: Gemini Plays Pokémon Continues to Evolve
Completing Pokémon Blue is not the end of the road for the project. As Joel Z points out, the system is still in development, with ongoing improvements to its capabilities. “Gemini Plays Pokémon is still actively being developed, and the framework continues to evolve,” he said, suggesting that future updates could enhance Gemini’s performance even further.
For now, the victory is a major milestone for Google’s AI division. While Gemini’s completion of Pokémon Blue may not be a perfect, unassisted feat, it demonstrates just how far AI has come in its ability to tackle complex, unpredictable tasks. Whether or not Gemini is the best at playing Pokémon remains to be seen, but its achievement is a powerful reminder of the potential of modern AI systems.
As AI continues to improve, the range of tasks it can take on keeps widening. Whether it’s competing in video games, enhancing decision-making in real-world scenarios, or supporting industries like healthcare and finance, the lessons learned from projects like Gemini Plays Pokémon are paving the way for more sophisticated, capable AI systems.
For now, though, fans of both Pokémon and artificial intelligence can sit back and enjoy this unique moment in tech history. Google’s Gemini may have just completed Pokémon Blue, but its journey in the gaming world has only just begun.