AI Spends 7,000 Hours Beating Pokémon Red’s First Gym, Yet Never Finds the Second Despite 50,000 Hours of Training
A programmer recently ran a remarkable experiment, putting an AI model through a staggering 50,000 hours of training at playing Pokémon Red. The result was an algorithm capable of navigating the early stages of the game and assembling a team strong enough to defeat the first gym leader. However, it fell short at more complex tasks such as finding its way through Mt. Moon, and it showed a curious penchant for repeatedly buying the less-than-stellar Magikarp. The experiment offers an intriguing glimpse into the inner workings of machine learning.
The AI, as demonstrated in a comprehensive video by Peter Whidden, interacted with the game through standard control inputs on an emulator, mirroring the actions of a human player. Each learning session covered two hours of in-game time, which emulation acceleration compressed to approximately six minutes of real time. Training was sped up further by running 40 sessions concurrently.
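To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes every session runs at a constant speed and that all 40 sessions run the whole time, and it ignores any time spent updating the model. The resulting wall-clock figure is an estimate derived from the numbers above, not one reported by Whidden.

```python
# Figures taken from the article; the variable names are illustrative.
GAME_MINUTES_PER_SESSION = 120  # two hours of in-game time per session
REAL_MINUTES_PER_SESSION = 6    # approximate real time per session
CONCURRENT_SESSIONS = 40        # sessions running side by side
TOTAL_GAME_HOURS = 50_000       # total in-game training time

# How much faster than real time a single emulator runs.
speedup = GAME_MINUTES_PER_SESSION / REAL_MINUTES_PER_SESSION

# In-game hours accumulated per real hour across all sessions
# (assumption: perfect parallel scaling).
game_hours_per_real_hour = speedup * CONCURRENT_SESSIONS

# Real hours needed to accumulate the full training time.
real_hours_needed = TOTAL_GAME_HOURS / game_hours_per_real_hour

print(f"Speed-up per emulator: {speedup:.0f}x")
print(f"In-game hours per real hour: {game_hours_per_real_hour:.0f}")
print(f"Estimated real hours of training: {real_hours_needed:.1f}")
```

Under these assumptions, each emulator runs at about 20 times normal speed, and the full 50,000 in-game hours would take on the order of 60 real hours.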
Since the AI has no inherent motivation to succeed in a video game, Whidden set specific goals to incentivize its progress. The AI earned reward points for discovering new parts of the game, which were detected as screens whose pixels differed noticeably from those it had already seen. This approach led to some peculiar behaviors, such as the AI becoming entranced by the subtle animations of water. Nevertheless, it generally encouraged the AI to progress from Pallet Town through Viridian Forest and onward to Pewter City, home to the first gym battle against Brock.
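A pixel-based exploration reward of this kind could look something like the sketch below. This is not Whidden's code: the `NoveltyReward` class, the `threshold` parameter, and the tiny six-pixel "frames" are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[int]  # flattened grayscale pixels; a stand-in for a real screenshot


def pixel_difference(a: Frame, b: Frame) -> int:
    """Count the pixels that differ between two equally sized frames."""
    return sum(1 for x, y in zip(a, b) if x != y)


class NoveltyReward:
    """Hypothetical reward: one point for each sufficiently new screen."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold          # minimum differing pixels to count as new
        self.seen: List[Tuple[int, ...]] = []  # every frame rewarded so far

    def score(self, frame: Frame) -> int:
        # A frame is novel only if it differs from every stored frame
        # by more than the threshold.
        if all(pixel_difference(frame, old) > self.threshold for old in self.seen):
            self.seen.append(tuple(frame))
            return 1
        return 0


town = (0, 0, 0, 0, 0, 0)
forest = (1, 1, 1, 1, 0, 0)         # a genuinely different screen
water_ripple = (0, 0, 0, 0, 0, 1)   # same screen, one animated pixel changed

reward = NoveltyReward(threshold=2)
print(reward.score(town), reward.score(forest), reward.score(water_ripple))
```

With a threshold of zero, the single-pixel water ripple would also earn a point, which illustrates how small animations can distract an agent that is rewarded for visual novelty.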
To foster more well-rounded gameplay, the AI needed additional rewards and penalties. Focused solely on exploration at first, it avoided battles and never caught Pokémon. To address this, Whidden added a reward based on the combined level of the Pokémon in the AI’s active party. This successfully motivated the AI to engage in battles and capture new Pokémon. However, it had an unintended side effect: when the AI deposited some of its Pokémon while visiting a Pokémon Center, the party’s combined level dropped sharply, and the AI lost a large amount of reward. It avoided Pokémon Centers from then on, until Whidden adjusted the reward system.
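The sketch below illustrates why depositing Pokémon is so costly under a level-sum reward, along with one possible adjustment. The article does not say how Whidden actually changed the reward, so `MonotonicLevelReward` is purely an assumption: it only pays out when the party's combined level exceeds the best total seen so far. All names and level values here are hypothetical.

```python
from typing import List


def level_reward(previous: List[int], current: List[int]) -> int:
    """Naive reward: the change in the party's combined level."""
    return sum(current) - sum(previous)


class MonotonicLevelReward:
    """Assumed fix: reward only new highs in combined level, never penalize drops."""

    def __init__(self) -> None:
        self.best_total = 0  # highest combined level observed so far

    def update(self, party_levels: List[int]) -> int:
        total = sum(party_levels)
        gain = max(0, total - self.best_total)
        self.best_total = max(self.best_total, total)
        return gain


party_before = [8, 5, 4]  # three Pokémon in the party
party_after = [8]         # two of them deposited at a Pokémon Center

# Under the naive reward, depositing looks like a large loss.
print(level_reward(party_before, party_after))

# Under the assumed fix, depositing costs nothing, and later gains still count.
fixed = MonotonicLevelReward()
print(fixed.update(party_before), fixed.update(party_after), fixed.update([9, 5, 4]))
```

The trade-off in this variant is that the agent is never punished for a shrinking party, so it no longer has a reason to fear the Pokémon Center.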
Since the AI essentially took random actions until it stumbled upon rewarding strategies, facing Brock presented a considerable challenge. Brock’s Rock-type Pokémon are weak to Water-type moves, but the AI struggled to exploit this. It wasn’t until one particular run, in which the AI’s Squirtle had exhausted the power points of every move except Bubble, that the algorithm grasped how to conquer the gym leader.
Interestingly, while the AI struggled with concepts that come naturally to human players, it rapidly acquired more esoteric knowledge. Whidden noticed that the AI consistently followed a seemingly nonsensical path from Pallet Town to its first encounter with a wild Pokémon. It later became clear that this precise sequence of inputs guaranteed that the wild Pokémon would be caught with a single Poké Ball. Remarkably, the AI had taught itself RNG (random number generator) manipulation, a technique that speedrunners typically spend years mastering.
Although vanquishing Brock represented a logical culmination of the project, Whidden allowed the AI to continue further. It made substantial progress within Mt. Moon but was ultimately stymied by the labyrinthine and monotonous passages, never reaching the second gym at Cerulean City.
One thing the AI surprisingly took to was buying Magikarp from the dubious salesman known for peddling the game’s worst Pokémon at an exorbitant price. For the AI, the purchase was a quick way to add five levels to its party total, which made it the most cost-effective deal in the game as far as its reward was concerned. Astonishingly, the AI bought the Magikarp over 10,000 times.
In a final whimsical twist, the AI captured a Rattata at one point and bestowed upon it the name ‘AI.’ In a world of algorithms and randomness, sometimes, things fall into place with a peculiar and delightful precision.